It is a fundamental consequence of the superposition principle 
for quantum states that there must exist non-orthogonal states, 
that is, states that, although different, have a non-zero overlap. 
This finite overlap means that there is no way of determining 
with certainty in which of two such states a given physical 
system has been prepared. We review the various strategies 
that have been devised to discriminate optimally between 
non-orthogonal states and some of the optical experiments that 
have been performed to realise these. © 2008 Optical Society of 
America 
1. Introduction 

The state of a quantum system is a mysterious object and has been the subject 
of much attention since the earliest days of quantum theory. We know that it 
provides a way of calculating the observed statistical properties of any desired 
observable but that it is not itself observable. This means that we cannot determine 
by observation the state of any single physical system. If we have some 
prior information, however, then we may be able to use this to determine, at 
least to some extent, the state. Consider, for example, a single photon which 
we know has been prepared with either horizontal or vertical polarisation. A 
suitably oriented polarising beam splitter can be used to transmit the photon if 
it was vertically polarised and reflect it if its polarisation was horizontal. De- 
termining the path of the photon by absorbing it with a suitable detector then 
determines the state to have been one of horizontal or vertical polarisation. 

Suppose, however, that we are told that our photon was prepared with either 
horizontal or left-circular polarisation. These quantum states of polarisation 
are not orthogonal, in that states of circular polarisation are superpositions 
of those of both vertical and horizontal polarisation: 

$$ |L\rangle = \frac{1}{\sqrt{2}}\left(|H\rangle + i|V\rangle\right). \tag{1} $$

If we subject our photon to the polarisation measurement outlined in the 
preceding paragraph then a left-circularly polarised photon will appear to be 
horizontally polarised with probability 1/2 and vertically polarised with the 
same probability. 

The problem of discriminating between such states is fundamental to the 
quantum theory of communications [1-4] and underlies the secrecy of the now 
well-reviewed science of quantum cryptography [5-9]. Indeed, we can use the 
connection between quantum state discrimination and quantum communica- 
tions to motivate the problem of state discrimination. We suppose that two 
parties, conventionally named Alice and Bob, wish to communicate using a 
quantum channel. To do this Alice selects from a given set of states, $|\psi_i\rangle$ (or 
more generally mixed states with density operators $\hat{\rho}_i$), with a given set of 
probabilities $p_i$. The selected state is encoded in the preparation of a given physical 
system, such as photon polarisation, and this is sent to Bob. Bob will know both 
the set of possible states and the associated preparation probabilities. His task 
is to determine, by means of a suitable measurement, the state selected by Alice 
and hence the intended message. This, then, is the quantum state discrimination 
problem: how can we best discriminate among a known set of possible states 
$|\psi_i\rangle$, each having been prepared with a known probability $p_i$? 

The quantum state discrimination problem, as posed here, has been the subject 
of active theoretical investigation for a long time [1-3, 10-14], but it is only 
comparatively recently that experiments have been performed, and most of these 
have been based on optics. There exist in the literature a number of reviews of 
and introductions to quantum state discrimination [4, 15-20]. Our purpose in 
preparing this review is twofold: first to bring the rapidly developing field up 
to date and secondly to introduce the idea of state discrimination to a wider 
audience in optics. It seems especially appropriate to do this as it is in simple 
optical experiments that the ideas are most transparent and where most of the 
important practical developments have been made. 

2. Generalised measurements 

Most of us are introduced to the idea of measurements in quantum theory in a 
manner that is, essentially, that formulated by von Neumann [21]. Each observable 
property $O$ is associated with a Hermitian operator $\hat{O}$ (or more precisely a 
self-adjoint one) the eigenvalues of which are the possible results of a measurement 
of $O$. If the eigenvalues and eigenvectors are $o_m$ and $|o_m\rangle$ then we can 
write the operator $\hat{O}$ in the diagonal form 

$$ \hat{O} = \sum_m o_m\,|o_m\rangle\langle o_m| . \tag{2} $$

If the system to be measured has been prepared in the state $|\psi\rangle$ then the 
probability that a measurement of $O$ will give the result $o_m$ is 

$$ P(o_m) = |\langle o_m|\psi\rangle|^2 . \tag{3} $$

Consider, for example, a measurement to determine whether the polarisation of 
a single photon is horizontal or vertical. A suitable operator, corresponding to 
this measurement, would be 

$$ \widehat{\mathrm{Pol}} = H\,|H\rangle\langle H| + V\,|V\rangle\langle V| . \tag{4} $$

The probability that a measurement of this property on a photon prepared in the 
circularly polarised state $|L\rangle$ will give the result $H$, corresponding to horizontal 
polarisation, is 

$$ P(H) = |\langle H|L\rangle|^2 = \frac{1}{2} . \tag{5} $$

It is helpful, in what follows, to rewrite the above probabilities as the expec- 
tation value of an operator. In this way the probability that a measurement of 
optical polarisation shows the photon to be horizontally polarised is 

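$$ P(H) = \langle\, |H\rangle\langle H| \,\rangle = \langle \hat{P}_H \rangle , \tag{6} $$

where $\hat{P}_H = |H\rangle\langle H|$, the projector onto the state $|H\rangle$, is the required operator. 
More generally, for our operator $\hat{O}$, the probability that a measurement gives 
the value $o_m$ is 

$$ P(o_m) = \langle\, |o_m\rangle\langle o_m| \,\rangle = \langle \hat{P}_m \rangle . \tag{7} $$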

We note that the measured value, itself, makes no explicit appearance in this 
probability; it is not the eigenvalue but only the corresponding eigenvector that 
determines the form of the projector and hence the probability for the associated 
measurement outcome. 

The projectors have four important mathematical properties and it is helpful 
to list these: 

• The projectors are Hermitian operators, $\hat{P}_m^\dagger = \hat{P}_m$. This property is 
associated with the fact that probabilities are, themselves, observable quantities. 

• They are positive operators, which means that $\langle\psi|\hat{P}_m|\psi\rangle \geq 0$ for all 
possible states $|\psi\rangle$. This reflects the fact that the expectation value of the 
projector is a probability and must, therefore, be positive or zero. 

• They are complete in that $\sum_m \hat{P}_m = \hat{1}$, so that the sum of the probabilities 
for all possible measurement outcomes is unity. 

• They are orthonormal in that $\hat{P}_m \hat{P}_n = 0$ unless $m = n$. This property is 
sometimes associated with the fact that measurement outcomes must be 
distinct (you can only get one of them). This view is, as we shall see, not 
correct. You can indeed only get one outcome but this does not require 
the orthonormality property. 

The theory of generalised measurements can be formulated simply by dropping 
the final requirement. To see how this works, we introduce a set of probability 
operators $\{\hat{\pi}_m\}$, each of which we wish to associate with a measurement 
outcome such that the probability that our measurement gives the result labeled 
$m$ is 

$$ P(m) = \langle \hat{\pi}_m \rangle . \tag{8} $$

We insist on the first three of the properties of the projectors, as these are re- 
quired if we are to maintain the probability interpretation (8), but drop the final 
requirement so that our probability operators have the properties: 

• The probability operators are Hermitian: $\hat{\pi}_m^\dagger = \hat{\pi}_m$. 

• They are positive operators: $\langle\psi|\hat{\pi}_m|\psi\rangle \geq 0$ for all possible states $|\psi\rangle$. 

• They are complete: $\sum_m \hat{\pi}_m = \hat{1}$. 

The set of probability operators characterising the possible outcomes of any 
generalised measurement is called a probability operator measure, usually 
abbreviated to POM [1,4]. You will often find this set referred to as a positive 
operator-valued measure or POVM [22, 23]. If the latter name is used then the 
probability operators become elements of a POVM. 

The differences between the projectors and more general probability opera- 
tors are best appreciated by reference to some simple examples and these will 
be given in the following sections. There are, however, some important and per- 
haps even surprising points and it is sensible to emphasise these here. Firstly, 
the three properties described above have a remarkable generality in that (i) 
any measurement can be described by the appropriate set of probability operators 
and (ii) any set of operators that satisfies the three properties corresponds 
to a possible measurement [4,22]. This means that we can seek the optimum 



measurement in any given situation mathematically, by searching among all sets 
of operators that satisfy the required properties. Having found this we know that 
a physical realisation of this will exist and can seek a way to implement it in the 
laboratory. The second point to emphasise is that the number of (orthogonal) 
projectors can only be less than or equal to the dimension of the state space. 
For optical polarisation, for example, there are only two orthogonal polarisa- 
tions and the state space is therefore two-dimensional. It follows that any von 
Neumann measurement of polarisation can only have two outcomes. By drop- 
ping the requirement for orthogonality, we allow a generalised measurement 
to have any number of outcomes, so a generalised measurement of polarisa- 
tion can have three, four or more different outcomes. Finally, a generalised 
measurement allows us to describe the simultaneous observation of incompati- 
ble observables, such as position and momentum or, in the context of quantum 
optics, orthogonal field quadratures [24,25]. Perhaps the first reported gener- 
alised optical measurement was of precisely this form [26,27]. 
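
As a concrete illustration of the points made in this section, the following minimal Python sketch builds a three-outcome generalised measurement on the two-dimensional polarisation space and checks the three required properties numerically. The particular operators (weight 2/3 on three real states separated by angles of 2π/3 in the horizontal/vertical basis) and all function names are assumptions chosen for illustration; they are not taken from the text above.

```python
import math

def outer(v):
    """Return the 2x2 real matrix |v><v| for a real 2-component vector v."""
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]

def three_outcome_pom():
    """Illustrative POM: (2/3)|t_k><t_k| for three real unit vectors t_k."""
    pom = []
    for k in range(3):
        angle = 2 * math.pi * k / 3
        t = (math.cos(angle), math.sin(angle))
        pom.append([[2 / 3 * x for x in row] for row in outer(t)])
    return pom

def is_hermitian(m, tol=1e-12):
    """A real matrix is Hermitian exactly when it is symmetric."""
    return abs(m[0][1] - m[1][0]) < tol

def is_positive(m, tol=1e-12):
    """A symmetric 2x2 matrix is positive iff trace and determinant are >= 0."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return trace > -tol and det > -tol

def is_complete(pom, tol=1e-12):
    """Check that the probability operators sum to the identity."""
    total = [[sum(m[i][j] for m in pom) for j in range(2)] for i in range(2)]
    return all(abs(total[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))

def outcome_probabilities(pom, psi):
    """P(m) = <psi|pi_m|psi> for a real, normalised state psi."""
    return [sum(psi[i] * m[i][j] * psi[j] for i in range(2) for j in range(2))
            for m in pom]

pom = three_outcome_pom()
print(all(is_hermitian(m) and is_positive(m) for m in pom), is_complete(pom))
print(outcome_probabilities(pom, (1.0, 0.0)))  # horizontally polarised input
```

The three operators are not mutually orthogonal, yet exactly one outcome occurs in each run because the probabilities sum to one for any input state.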

3. State Discrimination - Theory 

3.1. Minimum Error Discrimination 

In quantum state discrimination we wish to design a measurement to distinguish 
optimally between a given set of states. As we have seen in the previous section, 
any physically realisable measurement can be described by a probability oper- 
ator measure. Thus by mathematically formulating a figure of merit describing 
the performance of a measurement, we can search for the set of probability op- 
erators describing the optimal measurement. There are several possible figures 
of merit, each one corresponding to a different strategy. Possibly the simplest 
criterion which may be applied is to minimise the probability of making an error 
in identifying the state. We begin with the special case where the state is known 
to be one of two possible pure states, $|\psi_0\rangle$ and $|\psi_1\rangle$, with associated 
probabilities $p_0$ and $p_1 = 1 - p_0$. If outcome 0, associated with the probability 
operator $\hat{\pi}_0$, is taken to indicate that the state was $|\psi_0\rangle$, and outcome 1 
(associated with $\hat{\pi}_1$) is taken to indicate that the state was $|\psi_1\rangle$, the 
probability of making an error in determining the state is given by 

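$$
\begin{aligned}
P_{\mathrm{err}} &= P(\psi_0)P(1|\psi_0) + P(\psi_1)P(0|\psi_1) \\
&= p_0\langle\psi_0|\hat{\pi}_1|\psi_0\rangle + p_1\langle\psi_1|\hat{\pi}_0|\psi_1\rangle \\
&= p_0 - \mathrm{Tr}\left[\left(p_0|\psi_0\rangle\langle\psi_0| - p_1|\psi_1\rangle\langle\psi_1|\right)\hat{\pi}_0\right],
\end{aligned}
\tag{9}
$$

where in the last line we have used the completeness condition $\hat{\pi}_0 + \hat{\pi}_1 = \hat{1}$. 
This expression takes its minimum value when the second term reaches a maximum, 
which in turn is achieved if $\hat{\pi}_0$ is a projector onto the positive eigenket of 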




Fig. 1. The optimal minimum error measurement for discriminating between 
the pure states $|\psi_0\rangle$, $|\psi_1\rangle$ is a von Neumann measurement. For 
$p_0 = p_1 = 1/2$ this is a projective measurement onto the states $|\phi_0\rangle$, $|\phi_1\rangle$, 
symmetrically located either side of the signal states, and shown in blue 
here. For $p_0 > p_1$ the optimal measurement performs better when state $|\psi_0\rangle$ 
is sent; shown here in light blue (labeled $|\phi_0'\rangle$) is the case $p_0 = 3/4$. 

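the operator $p_0|\psi_0\rangle\langle\psi_0| - p_1|\psi_1\rangle\langle\psi_1|$. Note that two pure states define a 
two-dimensional space, and without loss of generality we can choose an orthogonal 
basis $\{|0\rangle, |1\rangle\}$ of this space such that the components of each state in this basis 
are real. Thus we can express $|\psi_0\rangle$, $|\psi_1\rangle$ as follows 

$$
\begin{aligned}
|\psi_0\rangle &= \cos\theta\,|0\rangle + \sin\theta\,|1\rangle \\
|\psi_1\rangle &= \cos\theta\,|0\rangle - \sin\theta\,|1\rangle ,
\end{aligned}
\tag{10}
$$

and the eigenvalues of $p_0|\psi_0\rangle\langle\psi_0| - p_1|\psi_1\rangle\langle\psi_1|$ can be calculated directly as 

$$ \lambda_\pm = \frac{1}{2}\left(p_0 - p_1 \pm \sqrt{1 - 4p_0p_1\cos^2 2\theta}\right) . \tag{11} $$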

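The minimum probability of making an error is then given by the so-called 
Helstrom bound [1] 

$$ P_{\mathrm{err}} = \frac{1}{2}\left(1 - \sqrt{1 - 4p_0p_1|\langle\psi_0|\psi_1\rangle|^2}\right) , \tag{12} $$

and the optimal measurement is simply a von Neumann measurement. In particular, 
for $p_0 = p_1 = 1/2$, the optimal measurement is a projective measurement 
onto the states 

$$
\begin{aligned}
|\phi_0\rangle &= \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right) \\
|\phi_1\rangle &= \frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right) .
\end{aligned}
\tag{13}
$$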

These are symmetrically located about the signal states, as may be expected 
from the symmetry of the problem. As $p_0$ is increased, $|\phi_0\rangle$ moves closer to 
$|\psi_0\rangle$, and the optimal measurement becomes biased towards making fewer errors 
when the more probable state is sent (see Fig. 1). Finally, as may be expected 
intuitively, if $p_0$ is much bigger than $p_1$, the optimal measurement is very close 
to simply asking "is the state $|\psi_0\rangle$ or not?" 
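
The two-state result can be checked numerically. The sketch below, in plain Python, evaluates the Helstrom bound for the pair of real states parametrised by the angle θ (overlap cos 2θ) and compares it with p_0 minus the positive eigenvalue of the operator p_0|ψ_0⟩⟨ψ_0| − p_1|ψ_1⟩⟨ψ_1|, computed from its 2×2 matrix. The function names are illustrative assumptions.

```python
import math

def helstrom_error(p0, theta):
    """Helstrom bound for two real pure states with overlap cos(2*theta)."""
    p1 = 1.0 - p0
    overlap = math.cos(2 * theta)
    return 0.5 * (1.0 - math.sqrt(1.0 - 4.0 * p0 * p1 * overlap ** 2))

def error_from_eigenvalue(p0, theta):
    """Minimum error as p0 minus the positive eigenvalue of
    A = p0|psi0><psi0| - p1|psi1><psi1|, with
    |psi0> = (cos t, sin t) and |psi1> = (cos t, -sin t)."""
    p1 = 1.0 - p0
    c, s = math.cos(theta), math.sin(theta)
    a00 = (p0 - p1) * c * c        # <0|A|0>
    a11 = (p0 - p1) * s * s        # <1|A|1>
    a01 = (p0 + p1) * c * s        # <0|A|1> = <1|A|0>
    trace = a00 + a11
    det = a00 * a11 - a01 * a01
    lam_plus = 0.5 * (trace + math.sqrt(trace * trace - 4.0 * det))
    return p0 - lam_plus

for p0, theta in [(0.5, 0.3), (0.75, 0.3), (0.9, 0.6)]:
    print(p0, theta, helstrom_error(p0, theta), error_from_eigenvalue(p0, theta))
```

The two columns printed for each parameter pair should agree to rounding error, since the optimal operator for outcome 0 is the projector onto the positive eigenket.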

3.1.a. The minimum error conditions 

The above analysis is easily extended to two mixed states $\hat{\rho}_0$, $\hat{\rho}_1$, in which 
case the optimal measurement becomes a projective measurement onto the 
subspaces corresponding to positive and negative eigenvalues of $p_0\hat{\rho}_0 - p_1\hat{\rho}_1$. 
In the general case of $N$ possible states $\{\hat{\rho}_i\}$ with associated a priori probabilities 
$\{p_i\}$, the aim is to minimise the expression 

$$ P_{\mathrm{err}} = \sum_{i=0}^{N-1} p_i \sum_{j \neq i} \mathrm{Tr}\left(\hat{\rho}_i\hat{\pi}_j\right) , \tag{14} $$

or equivalently to maximise 

$$ P_{\mathrm{corr}} = 1 - P_{\mathrm{err}} = \sum_{i=0}^{N-1} p_i\,\mathrm{Tr}\left(\hat{\rho}_i\hat{\pi}_i\right) . \tag{15} $$
;=0 

The optimal measurement is known only in certain special cases; however, 
necessary and sufficient conditions which must be satisfied by the optimal POM 
for the general case are known [1, 12, 13] and are given in equations (16,17). 
For simplicity, we prove only sufficiency of the conditions here, but we note 
that there is also a straightforward proof of their necessity [28]. 

    Necessary and sufficient conditions which must be satisfied by the 
    POM achieving minimum error in distinguishing between the states 
    $\{\hat{\rho}_i\}$, occurring with probabilities $\{p_i\}$, are given by 

    $$ \sum_i p_i\hat{\rho}_i\hat{\pi}_i - p_j\hat{\rho}_j \geq 0, \quad \forall\, j \tag{16} $$

    $$ \hat{\pi}_i\left(p_i\hat{\rho}_i - p_j\hat{\rho}_j\right)\hat{\pi}_j = 0, \quad \forall\, i,j . \tag{17} $$

    Note that these conditions are not independent; the second may be 
    derived from the first, as shown in the text. 



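If $\{\hat{\pi}_i\}$ corresponds to an optimal measurement, then for all other POMs $\{\hat{\pi}'_j\}$ 
we require 

$$ \sum_i p_i\,\mathrm{Tr}\left(\hat{\rho}_i\hat{\pi}_i\right) \geq \sum_j p_j\,\mathrm{Tr}\left(\hat{\rho}_j\hat{\pi}'_j\right) . \tag{18} $$

Inserting the identity $\sum_j \hat{\pi}'_j = \hat{1}$ gives 

$$ \sum_j \mathrm{Tr}\left[\left(\sum_i p_i\hat{\rho}_i\hat{\pi}_i - p_j\hat{\rho}_j\right)\hat{\pi}'_j\right] \geq 0 . \tag{19} $$

Note that $\hat{\pi}'_j \geq 0$, thus the above holds if equation (16) holds, which is therefore 
a sufficient condition. 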

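For any POM satisfying this condition, it follows that the operator $\hat{\Gamma} = 
\sum_i p_i\hat{\rho}_i\hat{\pi}_i$ is positive, and therefore Hermitian. Thus we have 

$$ \sum_j \left(\sum_i p_i\hat{\pi}_i\hat{\rho}_i - p_j\hat{\rho}_j\right)\hat{\pi}_j = \sum_i \hat{\pi}_i\left(p_i\hat{\rho}_i - \sum_j p_j\hat{\rho}_j\hat{\pi}_j\right) = 0 , \tag{20} $$

where we have used the fact that the probability operators form a resolution of 
the identity, $\sum_i \hat{\pi}_i = \hat{1}$. As both $\sum_i p_i\hat{\pi}_i\hat{\rho}_i - p_j\hat{\rho}_j$ and $\hat{\pi}_j$ are positive operators, 
each term in the sum over $j$ must be identically zero. Using similar reasoning 
we can argue that each term in the sum over $i$ must be identically zero. Thus, in 
terms of $\hat{\Gamma}$ we obtain 

$$ \left(\hat{\Gamma} - p_j\hat{\rho}_j\right)\hat{\pi}_j = \hat{\pi}_i\left(p_i\hat{\rho}_i - \hat{\Gamma}\right) = 0, \quad \forall\, i,j . \tag{21} $$

Eliminating $\hat{\Gamma}$ gives equation (17), which is therefore also a sufficient condition. 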

3.1.b. Square Root Measurement 

For any given set of states we can construct an associated measurement, the 
square root measurement [29-32], as follows 

$$ \hat{\pi}_i = p_i\,\hat{\rho}^{-1/2}\hat{\rho}_i\hat{\rho}^{-1/2} , \tag{22} $$

where $\hat{\rho} = \sum_i p_i\hat{\rho}_i$. It is clear that the probability operators $\{\hat{\pi}_i\}$ are positive 
and sum to the identity, and thus form a complete measurement. For many of 
the cases in which the optimal minimum error measurement is known, it is the 
square root measurement [33-38]. We will present here the example of $N$ symmetric 
pure states occurring with equal a priori probabilities $p_i = 1/N$, considered 
by Ban et al. [33], and given by 

$$ |\psi_i\rangle = \hat{U}|\psi_{i-1}\rangle = \hat{U}^i|\psi_0\rangle, \quad i = 0, \ldots, N-1 \tag{23} $$

for some unitary operator $\hat{U}$ satisfying $\hat{U}^N = \hat{1}$. For this set we have 

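$$ \hat{\rho} = \frac{1}{N}\sum_{i=0}^{N-1}|\psi_i\rangle\langle\psi_i| = \frac{1}{N}\sum_{i=0}^{N-1}\hat{U}^i|\psi_0\rangle\langle\psi_0|\hat{U}^{\dagger i} , \tag{24} $$

and it is useful to note that 

$$
\begin{aligned}
\hat{U}\hat{\rho}\hat{U}^\dagger &= \frac{1}{N}\sum_{i=0}^{N-1}\hat{U}^{i+1}|\psi_0\rangle\langle\psi_0|\hat{U}^{\dagger(i+1)} \\
&= \hat{\rho} ,
\end{aligned}
\tag{25}
$$

where in the last line we have used the property $\hat{U}^N = \hat{1}$. Thus 

$$ \hat{U}\hat{\rho} = \hat{U}\hat{\rho}\hat{U}^\dagger\hat{U} = \hat{\rho}\hat{U} \tag{26} $$

and $\hat{\rho}$ commutes with $\hat{U}$. The square root measurement consists of the operators 

$$ \hat{\pi}_i = \frac{1}{N}\hat{\rho}^{-1/2}|\psi_i\rangle\langle\psi_i|\hat{\rho}^{-1/2} = \frac{1}{N}\hat{\rho}^{-1/2}\hat{U}^i|\psi_0\rangle\langle\psi_0|\hat{U}^{\dagger i}\hat{\rho}^{-1/2} , \tag{27} $$

and condition (17) is equivalent to the requirement 

$$ \langle\psi_i|\hat{\rho}^{-1/2}|\psi_i\rangle\langle\psi_i|\hat{\rho}^{-1/2}|\psi_j\rangle - \langle\psi_i|\hat{\rho}^{-1/2}|\psi_j\rangle\langle\psi_j|\hat{\rho}^{-1/2}|\psi_j\rangle = 0 . \tag{28} $$

Noting that 

$$ \langle\psi_i|\hat{\rho}^{-1/2}|\psi_i\rangle = \langle\psi_0|\hat{U}^{\dagger i}\hat{\rho}^{-1/2}\hat{U}^i|\psi_0\rangle = \langle\psi_0|\hat{\rho}^{-1/2}|\psi_0\rangle, \quad \forall\, i, \tag{29} $$

we see that this requirement is satisfied. We now proceed to evaluate $\hat{\Gamma}$: 

$$
\begin{aligned}
\hat{\Gamma} &= \frac{1}{N^2}\sum_i |\psi_i\rangle\langle\psi_i|\hat{\rho}^{-1/2}|\psi_i\rangle\langle\psi_i|\hat{\rho}^{-1/2} \\
&= \frac{1}{N^2}\langle\psi_0|\hat{\rho}^{-1/2}|\psi_0\rangle\sum_i |\psi_i\rangle\langle\psi_i|\hat{\rho}^{-1/2} \\
&= \frac{1}{N}\langle\psi_0|\hat{\rho}^{-1/2}|\psi_0\rangle\,\hat{\rho}^{1/2} .
\end{aligned}
\tag{30}
$$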

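To satisfy condition (16) we require 

$$ \langle\phi|\left(\hat{\Gamma} - \frac{1}{N}|\psi_i\rangle\langle\psi_i|\right)|\phi\rangle \geq 0, \quad \forall\, i, |\phi\rangle . \tag{31} $$

Writing $\hat{\Gamma} = \frac{1}{N}\langle\psi_i|\hat{\rho}^{-1/2}|\psi_i\rangle\hat{\rho}^{1/2}$ we can show 

$$
\begin{aligned}
\langle\phi|\hat{\Gamma}|\phi\rangle &= \frac{1}{N}\langle\psi_i|\hat{\rho}^{-1/2}|\psi_i\rangle\langle\phi|\hat{\rho}^{1/2}|\phi\rangle \\
&\geq \frac{1}{N}\left|\langle\psi_i|\hat{\rho}^{-1/4}\hat{\rho}^{1/4}|\phi\rangle\right|^2 \\
&= \frac{1}{N}\left|\langle\psi_i|\phi\rangle\right|^2 ,
\end{aligned}
\tag{32}
$$

where we have used the Cauchy-Schwarz inequality. Thus condition (16) holds, 
and the square root measurement is optimal. Note that the case of two equiprobable 
pure states discussed above is an example of a symmetric set. In this case 
$\hat{U} = \hat{\sigma}_z$, and it may easily be verified that $\hat{\sigma}_z|\psi_0\rangle = |\psi_1\rangle$ and $\hat{\sigma}_z^2 = \hat{1}$. Another 
example of a symmetric set is the so-called trine ensemble [31,39], given by 

$$
\begin{aligned}
|\psi_0\rangle &= |0\rangle \\
|\psi_1\rangle &= -\frac{1}{2}\left(|0\rangle - \sqrt{3}\,|1\rangle\right) \\
|\psi_2\rangle &= -\frac{1}{2}\left(|0\rangle + \sqrt{3}\,|1\rangle\right) ,
\end{aligned}
\tag{33}
$$

and obtained from one another by rotation through an angle of $2\pi/3$. These states 
form a resolution of the identity, and the square root measurement consists of 
equally weighted projectors onto the states themselves, $\hat{\pi}_i = \frac{2}{3}|\psi_i\rangle\langle\psi_i|$. 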

The above solution has been extended to multiply symmetric states [36] and 
mixed states [37, 38]. The square root measurement has also been generalised 
by Mochon [40], who considered measurements of the form 

$$ \hat{\pi}_i = \hat{\sigma}^{-1/2}\,p'_i\hat{\rho}_i\,\hat{\sigma}^{-1/2} , \tag{34} $$

where $\hat{\sigma} = \sum_i p'_i\hat{\rho}_i$, i.e. the square-root measurements corresponding to the same 
set of states but constructed using different a priori probabilities. For pure 
states, each such measurement is optimal for at least one discrimination problem 
with the same states, occurring with probabilities given analytically in [40]. 
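
The construction of the square root measurement can be sketched in a few lines of plain Python for real qubit states. The helper below applies a scalar function to a real symmetric 2×2 matrix through its spectral decomposition, which is enough to form the inverse square root of the a priori density operator when it has full rank. All names are illustrative assumptions, and the trine states are taken here, as an assumed convention, to be three real unit vectors separated by angles of 2π/3.

```python
import math

def sym2_function(m, f):
    """Apply f to a real symmetric 2x2 matrix via its spectral decomposition."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = 0.5 * (a + d)
    gap = math.hypot(0.5 * (a - d), b)
    if gap < 1e-14:                      # multiple of the identity
        return [[f(mean), 0.0], [0.0, f(mean)]]
    lam_plus, lam_minus = mean + gap, mean - gap
    # f(M) = f(l+)(M - l- I)/(l+ - l-) + f(l-)(M - l+ I)/(l- - l+)
    c_plus = f(lam_plus) / (2 * gap)
    c_minus = -f(lam_minus) / (2 * gap)
    off = (c_plus + c_minus) * b
    return [[c_plus * (a - lam_minus) + c_minus * (a - lam_plus), off],
            [off, c_plus * (d - lam_minus) + c_minus * (d - lam_plus)]]

def square_root_measurement(states, priors):
    """Return pi_i = p_i rho^{-1/2}|psi_i><psi_i|rho^{-1/2} for real pure states.
    Assumes rho = sum_i p_i |psi_i><psi_i| has full rank."""
    rho = [[sum(p * v[i] * v[j] for p, v in zip(priors, states))
            for j in range(2)] for i in range(2)]
    inv_sqrt = sym2_function(rho, lambda x: 1.0 / math.sqrt(x))
    pom = []
    for p, v in zip(priors, states):
        w = [inv_sqrt[i][0] * v[0] + inv_sqrt[i][1] * v[1] for i in range(2)]
        pom.append([[p * w[i] * w[j] for j in range(2)] for i in range(2)])
    return pom

def success_probability(states, priors, pom):
    """P_corr = sum_i p_i <psi_i|pi_i|psi_i>."""
    return sum(p * sum(v[i] * m[i][j] * v[j] for i in range(2) for j in range(2))
               for p, v, m in zip(priors, states, pom))

trine = [(math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3))
         for k in range(3)]
trine_pom = square_root_measurement(trine, [1 / 3] * 3)
print(trine_pom[0], success_probability(trine, [1 / 3] * 3, trine_pom))
```

Applied to two equiprobable states (cos θ, ±sin θ), the same function returns projectors onto (|0⟩ ± |1⟩)/√2, so its success probability coincides with one minus the Helstrom bound.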

3.1.c. Other Results 

Most of the known results for minimum error discrimination correspond to one 
of the two cases discussed above: that of just two states, or those for which 
the square-root measurement is optimal. Another example which is interesting 
to note is the no-measurement strategy [41]. Sometimes the optimal discrimination 
strategy is not to measure at all, but just to guess the state which 
is a priori most likely, a measurement which may be described by the POM 
$\{\hat{\pi}_i = \hat{1},\ \hat{\pi}_j = 0\ \ \forall\, j \neq i\}$, where $i$ is such that $p_i \geq p_j\ \forall\, j$. Condition (17) holds 
trivially for this POM. Thus the no-measurement solution is optimal when 
condition (16) holds, which then reads 

$$ p_i\hat{\rho}_i - p_j\hat{\rho}_j \geq 0, \quad \forall\, j . \tag{35} $$

Clearly this is never optimal if $\hat{\rho}_i$ is pure; a necessary (but not sufficient) 
condition is that the eigenvectors of $\hat{\rho}_i$ span the entire Hilbert space in which the 
states $\{\hat{\rho}_j\}$ lie. A practical example is discriminating signal states from random 
noise, described by the density operator $\hat{\rho}_i \propto \hat{1}$. If the signal to noise ratio is 
small enough, the minimum error strategy is to always guess that the state 
received was random noise [41]. It is therefore useful to know the noise threshold 
at which this occurs, which may be deduced from the condition (35). 

Other examples for which explicit results are known include three mirror 
symmetric qubit states, both for pure [42], and mixed states [43], and the case 
of equi-probable pure states, a weighted sum of which equals the identity op- 
erator [13]. The form of the solution for any set of qubit states has also been 
explored in some detail by Hunter [44,45], including a complete characterisa- 
tion of the solution for equiprobable pure qubit states. In the general case, for 
which explicit results are not known, it is possible to deduce both upper [46,47], 
and lower [48,49] bounds on the error probability. Alternatively, numerical al- 
gorithms exist which can find the optimal measurement for a specified set of 
states to within any desired accuracy [50,51]. 

3.2. Unambiguous Discrimination 

In the minimum error measurement, each possible outcome is taken to indicate 
some corresponding state. It is perhaps surprising that it is sometimes advantageous 
to allow for measurement outcomes which do not lead us to identify any 
state. Suppose again that we wish to discriminate between the two pure states 
given by equation (10), occurring with a priori probabilities $p_0$, $p_1$. Consider 
the von Neumann measurement 

$$
\begin{aligned}
\hat{\pi}_? &= (\cos\theta\,|0\rangle - \sin\theta\,|1\rangle)(\cos\theta\,\langle 0| - \sin\theta\,\langle 1|) \\
\hat{\pi}_0 &= (\sin\theta\,|0\rangle + \cos\theta\,|1\rangle)(\sin\theta\,\langle 0| + \cos\theta\,\langle 1|) .
\end{aligned}
\tag{36}
$$

If outcome ?, associated with the probability operator $\hat{\pi}_?$, is realised, we cannot 
say for sure what state was prepared. However, note that $\langle\psi_1|\hat{\pi}_0|\psi_1\rangle = 0$, and 
thus when outcome 0, corresponding to POM element $\hat{\pi}_0$, is realised, we can say 
for certain that the state was $|\psi_0\rangle$. Thus, by allowing for measurement outcome 
?, which does not lead us to identify any state, we can construct a measurement 
which sometimes allows us to determine unambiguously which state was 
prepared. This measurement, however, only ever identifies state $|\psi_0\rangle$; ideally we 
would like to design a measurement which can identify either state unambiguously, 
at the cost of sometimes giving an inconclusive result. The generalised 
measurement formalism outlined above allows for exactly such a measurement, 
a possibility that was first pointed out in the seminal papers of Ivanovic [52], 
Dieks [53], and Peres [54]. 

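Consider therefore the operators 

$$
\begin{aligned}
\hat{\pi}_0 &= a_0\,(\sin\theta\,|0\rangle + \cos\theta\,|1\rangle)(\sin\theta\,\langle 0| + \cos\theta\,\langle 1|) \\
\hat{\pi}_1 &= a_1\,(\sin\theta\,|0\rangle - \cos\theta\,|1\rangle)(\sin\theta\,\langle 0| - \cos\theta\,\langle 1|) ,
\end{aligned}
\tag{37}
$$

chosen such that $\langle\psi_0|\hat{\pi}_1|\psi_0\rangle = \langle\psi_1|\hat{\pi}_0|\psi_1\rangle = 0$, and where $0 \leq a_0, a_1 \leq 1$. Thus 
when outcome 0 is realised, we can say for sure that the corresponding state was 
$|\psi_0\rangle$, while when outcome 1 occurs, we know the state was $|\psi_1\rangle$ with certainty. 
Note that these cannot form a complete measurement for any choice of $a_0$, $a_1$, 
unless $|\psi_0\rangle$, $|\psi_1\rangle$ are orthogonal, and thus an inconclusive outcome is needed, 
associated with the probability operator 

$$ \hat{\pi}_? = \hat{1} - \hat{\pi}_0 - \hat{\pi}_1 . \tag{38} $$

The probability of occurrence of the inconclusive result is given by 

$$ P(?) = p_0\langle\psi_0|\hat{\pi}_?|\psi_0\rangle + p_1\langle\psi_1|\hat{\pi}_?|\psi_1\rangle = 1 - \sin^2 2\theta\,(p_0a_0 + p_1a_1) , \tag{39} $$

and the unambiguous discrimination strategy may be further optimised by 
minimising this probability, subject to the constraints $a_0, a_1 \geq 0$, $\hat{\pi}_? \geq 0$. For equal 
a priori probabilities, $p_0 = p_1 = \frac{1}{2}$, the minimum value or IDP limit [52-54] is 
given by $P(?) = \cos 2\theta = |\langle\psi_0|\psi_1\rangle|$ and is achieved by the measurement 

$$
\begin{aligned}
\hat{\pi}_0 &= \frac{1}{2\cos^2\theta}\,(\sin\theta\,|0\rangle + \cos\theta\,|1\rangle)(\sin\theta\,\langle 0| + \cos\theta\,\langle 1|) \\
\hat{\pi}_1 &= \frac{1}{2\cos^2\theta}\,(\sin\theta\,|0\rangle - \cos\theta\,|1\rangle)(\sin\theta\,\langle 0| - \cos\theta\,\langle 1|) \\
\hat{\pi}_? &= (1 - \tan^2\theta)\,|0\rangle\langle 0| .
\end{aligned}
\tag{40}
$$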

For unequal prior probabilities [55], as $p_0$ is increased, the optimal measurement 
is given by equations (37,38) with 

$$ a_0 = \frac{1 - \sqrt{p_1/p_0}\,\cos 2\theta}{\sin^2 2\theta}, \qquad a_1 = \frac{1 - \sqrt{p_0/p_1}\,\cos 2\theta}{\sin^2 2\theta}, \tag{41} $$

giving $P(?) = 2\sqrt{p_0p_1}\cos 2\theta$. Thus the measurement becomes biased towards 
unambiguously identifying the state which is a priori more probable. Clearly 
when $\sqrt{p_0/p_1}\cos 2\theta > 1$ this no longer defines a physical measurement; the 
optimal measurement then is simply the von Neumann measurement given by 
equation (36). In this case $|\psi_1\rangle$ always gives the inconclusive result, and the 
probability of failure is $P(?) = p_0|\langle\psi_0|\psi_1\rangle|^2 + p_1$. Thus for $p_0$ much bigger 
than $p_1$, the optimal strategy is the one which rules out the less probable state, 
in contrast to the minimum error measurement, which in this regime 
(approximately) identifies or rules out the more probable state. 
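
The two regimes described above can be collected into a small Python function. This is a sketch under stated assumptions: the states are the real pair with overlap cos 2θ, the weights in the interior regime are taken as a_0 = (1 − √(p_1/p_0) cos 2θ)/sin² 2θ and its counterpart with the priors exchanged, and the branch in which p_1 dominates is filled in by symmetry rather than taken from the text. The function name is hypothetical.

```python
import math

def optimal_unambiguous(p0, theta):
    """Return (a0, a1, P_inconclusive) for two real pure states with overlap
    cos(2*theta), assuming 0 < p0 < 1 and 0 < theta <= pi/4."""
    p1 = 1.0 - p0
    overlap = math.cos(2 * theta)
    s2 = math.sin(2 * theta) ** 2
    a0 = (1.0 - math.sqrt(p1 / p0) * overlap) / s2
    a1 = (1.0 - math.sqrt(p0 / p1) * overlap) / s2
    if a1 < 0.0:
        # p0 dominates: von Neumann measurement that only ever identifies psi0
        a0, a1 = 1.0, 0.0
    elif a0 < 0.0:
        # p1 dominates (by symmetry): only psi1 is ever identified
        a0, a1 = 0.0, 1.0
    # probability of the inconclusive outcome for the chosen weights
    p_fail = 1.0 - s2 * (p0 * a0 + p1 * a1)
    return a0, a1, p_fail

print(optimal_unambiguous(0.5, 0.3))   # equal priors: IDP limit cos(2*theta)
print(optimal_unambiguous(0.9, 0.3))   # strongly biased priors
```

For equal priors the returned failure probability equals the overlap cos 2θ, and for strongly biased priors it reduces to p_0 cos² 2θ + p_1, in line with the expressions quoted above.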

A simple example from quantum optics might help to illustrate the main 
idea [56]. Let us suppose that we have an optical pulse known to have been 
prepared, with equal probability, in one of the two coherent states [57] $|\alpha\rangle$ or 
$|-\alpha\rangle$. If we interfere the pulse with a second pulse prepared in the state $|i\alpha\rangle$ 
using a 50:50 symmetric beam splitter then one of the output modes will be left 
in its vacuum state $|0\rangle$: 

$$
\begin{aligned}
|\alpha\rangle|i\alpha\rangle &\rightarrow |0\rangle|i\sqrt{2}\alpha\rangle \\
|-\alpha\rangle|i\alpha\rangle &\rightarrow |-\sqrt{2}\alpha\rangle|0\rangle .
\end{aligned}
\tag{42}
$$

The state can be identified simply by detecting the light in the associated output 
mode. The ambiguous outcome is a consequence of the fact that the coherent 
states have a non-zero overlap with the vacuum state, and the probability for 
this result is 

$$ P_? = |\langle i\sqrt{2}\alpha|0\rangle|^2 = |\langle -\sqrt{2}\alpha|0\rangle|^2 = |\langle\alpha|-\alpha\rangle| , \tag{43} $$

which is the IDP limit. 
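
The coherent-state example can be reproduced with a few lines of Python. The sketch assumes one common convention for a symmetric 50:50 beam splitter acting on coherent amplitudes, (α, β) → ((α + iβ)/√2, (iα + β)/√2), and uses the standard coherent-state overlap ⟨α|β⟩ = exp(−|α|²/2 − |β|²/2 + α*β). The function names are illustrative.

```python
import cmath
import math

def beam_splitter(alpha, beta):
    """Output amplitudes of a symmetric 50:50 beam splitter (assumed convention)."""
    r = 1.0 / math.sqrt(2.0)
    return r * (alpha + 1j * beta), r * (1j * alpha + beta)

def coherent_overlap(alpha, beta):
    """<alpha|beta> for two coherent states."""
    alpha, beta = complex(alpha), complex(beta)
    return cmath.exp(-0.5 * abs(alpha) ** 2 - 0.5 * abs(beta) ** 2
                     + alpha.conjugate() * beta)

alpha = 0.6 + 0.3j
for signal in (alpha, -alpha):
    out = beam_splitter(signal, 1j * alpha)
    bright = out[1] if abs(out[0]) < 1e-12 else out[0]   # the non-vacuum port
    p_inconclusive = abs(coherent_overlap(bright, 0.0)) ** 2
    print(out, p_inconclusive, abs(coherent_overlap(alpha, -alpha)))
```

For either input the probability that the bright port registers no light matches |⟨α|−α⟩| = exp(−2|α|²).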



3.2.a. N > 2 Pure States 

In the general case of discriminating unambiguously between $N$ pure states 
$\{|\psi_i\rangle\}$, $i = 0, \ldots, N-1$, we wish to find probability operators $\{\hat{\pi}_j\}$ such that 

$$ \langle\psi_i|\hat{\pi}_j|\psi_i\rangle = P_i\,\delta_{ij} , \tag{44} $$

where $0 \leq P_i \leq 1$. Thus outcome $j$ is obtained only if the state is $|\psi_j\rangle$, in which 
case it occurs with probability $P_j$. We first note that this is only possible if the 
states $\{|\psi_i\rangle\}$ are linearly independent, as was shown by Chefles [58]. When this 
is the case, we can construct states $|\psi_j^\perp\rangle$ such that 

$$ \langle\psi_i|\psi_j^\perp\rangle = \langle\psi_j|\psi_j^\perp\rangle\,\delta_{ij} , \tag{45} $$

i.e. $|\psi_j^\perp\rangle$ is orthogonal to all allowed states except $|\psi_j\rangle$. The POM elements 

$$ \hat{\pi}_j = \frac{P_j}{|\langle\psi_j|\psi_j^\perp\rangle|^2}\,|\psi_j^\perp\rangle\langle\psi_j^\perp| \tag{46} $$

thus satisfy equation (44), and unambiguously discriminate between the linearly 
independent states $\{|\psi_j\rangle\}$. As before, an inconclusive outcome is necessary 
to form a complete measurement 

$$ \hat{\pi}_? = \hat{1} - \sum_j \hat{\pi}_j . \tag{47} $$

The above defines the unambiguous discrimination strategy for $N$ linearly 
independent states. The occurrence of outcome $j$ indicates unambiguously that 
the state was $|\psi_j\rangle$. As in the two state case, a further condition which may be 
applied is to minimise the probability of obtaining an inconclusive result. 
Analytical solutions for the minimum achievable $P(?)$ are not known in the general 
case, but the solution for three states is given by Peres and Terno [59], who 
also discuss how the method used can be extended to higher dimensions. For 
the special case in which the probability of unambiguously identifying a state 
$|\psi_j\rangle$ is the same for all $j$ ($P_j = P$, $\forall\, j$) the minimum probability of obtaining 
an inconclusive result is known [58]. Further, the optimal strategy minimising 
this probability is given for $N$ linearly independent symmetric states in [60]. 
For the general case, upper [61] and lower bounds [62, 63] have been given for 
the probability of successful unambiguous discrimination of $N$ linearly 
independent states, and numerical optimisation techniques have also been 
considered [63,64]. 

3.2.b. Mixed States 

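It is only relatively recently that unambiguous discrimination has been extended 
to mixed states [65], where it may be applied to problems such as quantum 
state comparison [65,66], subset discrimination [67], and determining whether 
a given state is pure or mixed [68]. Consider the problem of discriminating 
between two mixed states $\hat{\rho}_0$, $\hat{\rho}_1$, which may be written in terms of their 
eigenvalues and eigenvectors as follows 

$$
\begin{aligned}
\hat{\rho}_0 &= \sum_i \lambda_i^{(0)}\,|\lambda_i^{(0)}\rangle\langle\lambda_i^{(0)}| \\
\hat{\rho}_1 &= \sum_j \lambda_j^{(1)}\,|\lambda_j^{(1)}\rangle\langle\lambda_j^{(1)}| ,
\end{aligned}
\tag{48}
$$

where $0 < \lambda_i^{(0)}, \lambda_j^{(1)} \leq 1$. Define the projectors 

$$
\begin{aligned}
\hat{\Lambda}^{(0)}_{\ker} &= \hat{1} - \sum_i |\lambda_i^{(0)}\rangle\langle\lambda_i^{(0)}| \\
\hat{\Lambda}^{(1)}_{\ker} &= \hat{1} - \sum_j |\lambda_j^{(1)}\rangle\langle\lambda_j^{(1)}| ,
\end{aligned}
\tag{49}
$$

such that $\hat{\Lambda}^{(0)}_{\ker}\hat{\rho}_0 = \hat{\Lambda}^{(1)}_{\ker}\hat{\rho}_1 = 0$. These are the projectors onto the kernels of 
$\hat{\rho}_0$ and $\hat{\rho}_1$ respectively¹. If we now define $\hat{\pi}_1$ to lie in the kernel of $\hat{\rho}_0$, then 
$\hat{\pi}_1 = \hat{\Lambda}^{(0)}_{\ker}\hat{\pi}_1\hat{\Lambda}^{(0)}_{\ker}$ and clearly 

$$ \mathrm{Tr}\left(\hat{\rho}_0\hat{\pi}_1\right) = \mathrm{Tr}\left(\hat{\rho}_0\hat{\Lambda}^{(0)}_{\ker}\hat{\pi}_1\hat{\Lambda}^{(0)}_{\ker}\right) = 0 . \tag{50} $$

Thus if there exists a positive operator $\hat{\pi}_1$ in the kernel of $\hat{\rho}_0$ for which 
$\mathrm{Tr}(\hat{\rho}_1\hat{\pi}_1) \neq 0$, then $\hat{\rho}_1$ may be unambiguously discriminated from $\hat{\rho}_0$. Similarly 
$\hat{\pi}_0$ should lie in the kernel of $\hat{\rho}_1$. Thus a necessary and sufficient condition for 
unambiguous discrimination between two mixed states is that they have 
non-identical kernels, and thus non-identical supports [65]. Unless the states are 
orthogonal an inconclusive outcome will be needed, as before, $\hat{\pi}_? = \hat{1} - \hat{\pi}_0 - \hat{\pi}_1$. 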
The problem of finding the strategy which minimises the probability of 
occurrence of the inconclusive result is again a difficult one, and one which has 
received much attention in the past few years. Solutions are known for some 
special cases: when both states have one-dimensional kernels [65], and for 
unambiguous discrimination between a pure and a mixed state, first in two 
dimensions [69] and later extended to N dimensions [70]; other 
examples may be found in [71-73]. Reduction theorems given in [74] show that 
it is always possible to reduce the general problem to one of discriminating two 
states each of rank r, which together span a 2r-dimensional space. Thus the 
simplest case which is not reducible to pure state discrimination is the problem 
of two rank-2 density operators in a 4-dimensional space, which was recently 
analysed in detail by Kleinmann et al [75]. Upper and lower bounds for the 
general case are given in [65,76,77], a further reduction theorem in [72], and 
numerical algorithms are discussed in [78]. 

3.3. Maximum confidence measurements 

As pointed out in the previous section, unambiguous discrimination is possible 
only when the allowed states are all linearly independent. If this is not the case, 
there will always be errors associated with identifying some states, even if an 
inconclusive outcome is allowed. Nevertheless, we can construct an analogous 
measurement, one which allows us to be as confident as possible that when 
the outcome of measurement leads us to identify a given state $|\psi_i\rangle$, that was 
indeed the state prepared [79]. Just as with unambiguous discrimination, this 
measurement is concerned with optimising the information given about the state 
by particular measurement outcomes, specifically the posterior probabilities 

$$ P(\hat{\rho}_i|i) = \frac{P(\hat{\rho}_i)\,P(i|\hat{\rho}_i)}{P(i)} . \tag{51} $$

¹ The support of a mixed state $\hat{\rho}$ is the subspace spanned by its eigenvectors with non-zero 
eigenvalues. The kernel of a mixed state is the subspace orthogonal to its support. 

Physically, in many runs of an experiment, this probability refers to the 
proportion of occurrences of outcome $i$ which were due to state $\hat{\rho}_i$. In a single-shot 
measurement, this therefore corresponds to the probability that it was state $\hat{\rho}_i$ 
that gave rise to outcome $i$. Thus, we can think of this quantity as our confidence 
in taking outcome $i$ to indicate state $\hat{\rho}_i$. In terms of the probability operator $\hat{\pi}_i$ 
associated with outcome $i$, we can write 

$$ P(\hat{\rho}_i|i) = \frac{p_i\,\mathrm{Tr}\left(\hat{\rho}_i\hat{\pi}_i\right)}{\mathrm{Tr}\left(\hat{\rho}\hat{\pi}_i\right)} , \tag{52} $$

where $\hat{\rho} = \sum_j p_j\hat{\rho}_j$ is the a priori density operator. We note that $\hat{\pi}_i$ appears in 
both the numerator and denominator of this expression, and thus can only be 
determined up to a multiplicative constant. It is always possible, therefore, to 
choose the overall normalisation such that 

$$ \sum_i \hat{\pi}_i \leq \hat{1} , \tag{53} $$

and a physically realisable measurement may be constructed by adding an 
inconclusive result. Thus the only constraint we need worry about is that $\hat{\pi}_i \geq 0$. 
Optimisation of this figure of merit is greatly facilitated by the use of the ansatz 

$$ \hat{\pi}_i = \hat{\rho}^{-1/2}\hat{Q}_i\hat{\rho}^{-1/2} , \tag{54} $$

where, by construction, $\hat{\pi}_i \geq 0$ if $\hat{Q}_i \geq 0$. With this, equation (52) becomes 

$$ P(\hat{\rho}_i|i) = \frac{p_i\,\mathrm{Tr}\left(\hat{\rho}^{-1/2}\hat{\rho}_i\hat{\rho}^{-1/2}\hat{Q}_i\right)}{\mathrm{Tr}(\hat{Q}_i)} , \tag{55} $$

where we have used the cyclical property of the trace. Note that $\hat{Q}_i/\mathrm{Tr}(\hat{Q}_i)$ is a 
positive, trace one operator, and so has the mathematical properties of a density 
operator. The density operator which has largest overlap with any operator $\hat{A}$ is 
simply a projector onto the eigenvector of $\hat{A}$ with the largest eigenvalue (or any 
density operator in the eigensubspace corresponding to the largest eigenvalue if 
this is degenerate). 
For pure states the optimal probability operators are therefore given by 

$$ \hat{\pi}_i \propto \hat{\rho}^{-1}\hat{\rho}_i\hat{\rho}^{-1} , \tag{56} $$

while for mixed states they may be written 

$$ \hat{\pi}_i \propto \hat{\rho}^{-1/2}\hat{\sigma}_i\hat{\rho}^{-1/2} , \tag{57} $$

where $\hat{\sigma}_i$ is any density operator lying in the eigensubspace of $\hat{\rho}^{-1/2}p_i\hat{\rho}_i\hat{\rho}^{-1/2}$ 
corresponding to its largest eigenvalue. Finally, the limit is given by 

$$ \left[P(\hat{\rho}_i|i)\right]_{\max} = \gamma_{\max}\left(\hat{\rho}^{-1/2}p_i\hat{\rho}_i\hat{\rho}^{-1/2}\right) , \tag{58} $$

where $\gamma_{\max}(\hat{A})$ denotes the largest eigenvalue of $\hat{A}$. 

Fig. 2. Bloch sphere representation of states. The states used in the example, 
along with the states onto which the optimal maximum confidence and 
minimum error POM elements project, are shown. ([79], Copyright (2006) 
by the American Physical Society.) 

The simplest non-trivial example of a set of linearly dependent states is that 
of three states in two dimensions. To illustrate this strategy we consider the 
problem of discriminating between the states 

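\[ \begin{aligned}
|\psi_0\rangle &= \cos\theta|0\rangle + \sin\theta|1\rangle, \\
|\psi_1\rangle &= \cos\theta|0\rangle + e^{2\pi i/3}\sin\theta|1\rangle, \\
|\psi_2\rangle &= \cos\theta|0\rangle + e^{-2\pi i/3}\sin\theta|1\rangle,
\end{aligned} \tag{59} \]

where $0 \le \theta \le \pi/4$, occurring with equal a priori probabilities
$p_i = 1/3$, $i = 0, 1, 2$. These states are symmetrically located at the same
latitude of the Bloch sphere, as shown in Fig. 2. The a priori density operator
for this set is

\[ \hat\rho = \cos^2\theta|0\rangle\langle 0| + \sin^2\theta|1\rangle\langle 1|, \tag{60} \]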

and the maximum confidence POM elements may be readily calculated using
equation (56). These have the form $\hat\pi_i = a_i|\phi_i\rangle\langle\phi_i|$,
where we have some freedom in choosing the constants $a_i$, $i = 0, 1, 2$, and

\[ \begin{aligned}
|\phi_0\rangle &= \sin\theta|0\rangle + \cos\theta|1\rangle, \\
|\phi_1\rangle &= \sin\theta|0\rangle + e^{2\pi i/3}\cos\theta|1\rangle, \\
|\phi_2\rangle &= \sin\theta|0\rangle + e^{-2\pi i/3}\cos\theta|1\rangle.
\end{aligned} \tag{61} \]

These states correspond to reflections of the input states in the equatorial plane
of the Bloch sphere, and are also shown in Fig. 2. It is not possible in general
to choose $a_0, a_1, a_2$ such that $\{\hat\pi_i\}$ form a complete measurement,
and thus an additional operator, $\hat\pi_? = 1 - \sum_i \hat\pi_i$, associated
with an inconclusive result, is needed. We may choose to complete the
measurement by minimising the probability of an inconclusive result



\[ P(?) = \mathrm{Tr}(\hat\rho\hat\pi_?) = 1 - 2(a_0 + a_1 + a_2)\cos^2\theta\sin^2\theta. \tag{62} \]

As $P(?)$ is a monotonically decreasing function of $a_i$, the optimal values of
these parameters lie on the boundary of the allowed domain, defined by the
constraint $\hat\pi_? \ge 0$. It is straightforward to show that this leads us to
choose $a_0 = a_1 = a_2 = (3\cos^2\theta)^{-1}$, giving

\[ \hat\pi_? = (1 - \tan^2\theta)|0\rangle\langle 0|. \tag{63} \]
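The construction of this measurement can be checked numerically. The following is a minimal sketch, assuming the symmetric choice $a_i = (3\cos^2\theta)^{-1}$ and $0 < \theta \le \pi/4$; the function names are illustrative and not taken from the cited work. It builds the three conclusive elements, completes them with the inconclusive element, and evaluates $P(?)$.

```python
import cmath
import math


def max_confidence_pom(theta):
    """Return ([pi_0, pi_1, pi_2], pi_inconclusive) as 2x2 nested lists."""
    a = 1.0 / (3.0 * math.cos(theta) ** 2)   # common weight a_i (assumed symmetric choice)
    omega = cmath.exp(2j * math.pi / 3)      # phase e^{2 pi i / 3}
    elements = []
    for k in (0, 1, -1):
        # |phi_k> = sin(theta)|0> + omega^k cos(theta)|1>
        phi = (complex(math.sin(theta)), omega ** k * math.cos(theta))
        elements.append([[a * phi[r] * phi[c].conjugate() for c in range(2)]
                         for r in range(2)])
    # The inconclusive element completes the measurement: pi_? = 1 - sum_i pi_i
    pi_inc = [[(1.0 if r == c else 0.0) - sum(e[r][c] for e in elements)
               for c in range(2)] for r in range(2)]
    return elements, pi_inc


def inconclusive_probability(theta):
    """P(?) = Tr(rho pi_?) with rho = diag(cos^2 theta, sin^2 theta)."""
    _, pi_inc = max_confidence_pom(theta)
    rho_diag = (math.cos(theta) ** 2, math.sin(theta) ** 2)
    return (rho_diag[0] * pi_inc[0][0] + rho_diag[1] * pi_inc[1][1]).real
```

With these weights the inconclusive element has a single non-zero entry, $1 - \tan^2\theta$, and the inconclusive probability reduces to $\cos 2\theta$, so it vanishes only at $\theta = \pi/4$.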

It is useful to compare this measurement with the minimum error (ME) 
measurement, which for this set is given by the square root measurement dis- 
cussed earlier 

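\[ \hat\pi_i^{ME} = \frac{1}{3}\hat\rho^{-1/2}|\psi_i\rangle\langle\psi_i|\hat\rho^{-1/2} = \frac{2}{3}|\phi_i^{ME}\rangle\langle\phi_i^{ME}|, \tag{64} \]

where

\[ \begin{aligned}
|\phi_0^{ME}\rangle &= \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle), \\
|\phi_1^{ME}\rangle &= \frac{1}{\sqrt{2}}(|0\rangle + e^{2\pi i/3}|1\rangle), \\
|\phi_2^{ME}\rangle &= \frac{1}{\sqrt{2}}(|0\rangle + e^{-2\pi i/3}|1\rangle).
\end{aligned} \tag{65} \]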

In the Bloch sphere representation, these states correspond to the projection of 
the input states onto the equatorial plane, as can be seen in Fig. 2. The minimum 
error and maximum confidence figures of merit are shown for each measure- 
ment in Fig. 3. For the minimum error measurement, each outcome leads us 



to identify some state, and the average probability of making an error is min- 
imised. However, the confidence in identifying a state may be increased by 
allowing for an inconclusive result, as may be seen from the plots. When a non- 
inconclusive result is obtained in the maximum confidence measurement, the 
probability that the state prepared really was the one identified is $2/3$, compared
with $\frac{1}{3}(1 + \sin 2\theta)$ for the minimum error measurement.
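The two confidence values can be reproduced from equation (52). The sketch below assumes equal priors of 1/3 and rank-one probability operators proportional to $|\phi\rangle\langle\phi|$, for which the normalisation of $|\phi\rangle$ cancels; the helper names are hypothetical.

```python
import cmath
import math


def confidence(prior, psi, phi, rho_diag):
    """P(psi|outcome) = prior * |<phi|psi>|^2 / <phi|rho|phi> for pi ~ |phi><phi|."""
    overlap = sum(f.conjugate() * s for f, s in zip(phi, psi))
    weight = sum(r * abs(f) ** 2 for r, f in zip(rho_diag, phi))
    return prior * abs(overlap) ** 2 / weight


def compare_confidences(theta, k=1):
    """Return (maximum confidence value, minimum error value) for state k."""
    omega_k = cmath.exp(2j * math.pi * k / 3)
    psi = (complex(math.cos(theta)), omega_k * math.sin(theta))
    rho_diag = (math.cos(theta) ** 2, math.sin(theta) ** 2)
    phi_mc = (complex(math.sin(theta)), omega_k * math.cos(theta))  # reflected state
    phi_me = (complex(1.0), omega_k)        # equatorial state, left unnormalised
    return (confidence(1 / 3, psi, phi_mc, rho_diag),
            confidence(1 / 3, psi, phi_me, rho_diag))
```

For any $0 < \theta \le \pi/4$ the first value is 2/3 and the second is $(1 + \sin 2\theta)/3$, so the two measurements give the same confidence only at $\theta = \pi/4$, where the states are mutually equidistant on the equator.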




Fig. 3. Graphs showing the maximum confidence (left) and minimum error (right)
figures of merit, for various values of the parameter $\theta$ for the example
discussed in the text. In each case the values achieved by the optimal maximum
confidence measurement are indicated by a dashed line and those corresponding to
the optimal minimum error measurement are indicated by a solid line.

3.3.a. Other Similar Strategies

A related strategy may be constructed by applying a worst case optimality
criterion to the conditional probability considered here, $P(\hat\rho_i|i)$ [80].
This approach does not allow for inconclusive results, but searches for the
measurement for which the smallest value of $P(\hat\rho_i|i)$ is maximised. This
more complicated problem is difficult to solve analytically, but may be cast as
a quasi-convex optimisation, for which efficient numerical techniques are
available. An alternative strategy allows inconclusive results to occur with a
certain fixed probability, $P_I$, and maximises the probability of correctly
identifying the state when a non-inconclusive outcome is obtained. For linearly
independent pure states this approach interpolates between minimum error and
unambiguous discrimination [81,82]. For mixed states a similar approach is
possible [83], which may be interpreted as interpolating between a minimum error
measurement and a maximum confidence strategy. It is clear that the probability
of obtaining a correct result, renormalised over only the results which are not
inconclusive, denoted $P_{RC}$, can never be larger than the largest value of
$[P(\hat\rho_i|i)]_{\max}$ for a given set, regardless of how much we increase
$P_I$. This upper bound is achieved by a maximum confidence strategy which only
ever identifies the state(s) $\hat\rho_i$ such that (from equation (58))

\[ \gamma_{\max}\left( \hat\rho^{-1/2} p_i \hat\rho_i \hat\rho^{-1/2} \right) \ge \gamma_{\max}\left( \hat\rho^{-1/2} p_j \hat\rho_j \hat\rho^{-1/2} \right) \quad \forall \hat\rho_j, \tag{66} \]

while all other results are interpreted as inconclusive. Although it is difficult
to find the optimal measurement for general $P_I$, it is indeed found that
$P_{RC}$ is saturated at some value of $P_I$, and the maximum $P_{RC}$ achievable
corresponds to the strategy outlined here [83].

3.3.b. Related problems - quantum state filtration

Quantum state filtration refers to the problem of determining whether the state
of a system is a given state $|\psi_i\rangle$ or simply any one of the other
states in a given set $\{|\psi_j\rangle\}$, $j \ne i$. This problem is less
demanding than complete discrimination among all possible states, and in the
minimum error approach the probability of error may be smaller in the state
filtration case [84]. For the maximum confidence measurement, however, the
optimality of the probability operator $\hat\pi_i$ in equation (57) is
independent of the number and interpretation of other possible outcomes. Thus
the confidence in identifying a given state from a set cannot be increased by
considering this simpler problem. This figure of merit is dependent only on the
geometry of the set, and in this sense can be thought of as a measure of how
distinguishable $\hat\rho_i$ is in the given set.

3.4. Comments on the Relationships Between Strategies 

The maximum confidence strategy was introduced as an analogy to unambiguous
discrimination for linearly dependent states [79]. In fact, unambiguous
discrimination is a special case of maximum confidence discrimination. The
maximum confidence measurement maximises the conditional probability
$P(\hat\rho_i|i)$. If this figure of merit is equal to unity for some state
$\hat\rho_i$, the optimal measurement is such that, when outcome $i$ is obtained,
we can be absolutely certain that $\hat\rho_i$ was in fact the state received,
corresponding to unambiguous discrimination. We can use the maximum confidence
formalism to investigate when unambiguous discrimination is possible.
Equation (52) may be written

\[ P(\hat\rho_i|i) = \frac{p_i \mathrm{Tr}(\hat\rho_i\hat\pi_i)}{p_i \mathrm{Tr}(\hat\rho_i\hat\pi_i) + \sum_{j \ne i} p_j \mathrm{Tr}(\hat\rho_j\hat\pi_i)}. \tag{67} \]

Clearly the limit of unity may be achieved if there exists any projector
$\hat\Lambda_i$ for which $\hat\Lambda_i \sum_{j \ne i} p_j\hat\rho_j \hat\Lambda_i = 0$
while $\hat\Lambda_i\hat\rho_i\hat\Lambda_i$ is non-zero. $\hat\pi_i$ is then any
operator lying in the subspace with projector $\hat\Lambda_i$. This reproduces
the known results that unambiguous discrimination is possible between pure
states if the states are linearly independent [58], and between mixed states if
they have distinct supports [65]. More precisely, a measurement is possible
which will sometimes allow us to identify $\hat\rho_i$ unambiguously if
$\hat\rho_i$ has support in the kernel of $\sum_{j \ne i} p_j\hat\rho_j$. This
condition is less restrictive than the previous, which does not hold in the case
where it is possible to unambiguously discriminate some but not all states in a
set. Unambiguous discrimination is still possible in this case, but some states
are never identified. For example, it was pointed out by Sun et al. [69] that it
is possible to apply unambiguous discrimination to the problem of determining
whether a system is in a given state $|\psi_0\rangle$ or in either of two other
possible states, $|\psi_1\rangle$, $|\psi_2\rangle$, even if the states span only
two dimensions, and therefore are linearly dependent. This may be more easily
understood as unambiguous discrimination between a mixed state and a pure state
in two dimensions [65].
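Let

\[ \begin{aligned}
\hat\rho_0 &= |\psi_0\rangle\langle\psi_0|, \\
\hat\rho_1 &= \frac{p_1}{p_1 + p_2}|\psi_1\rangle\langle\psi_1| + \frac{p_2}{p_1 + p_2}|\psi_2\rangle\langle\psi_2| = q|0\rangle\langle 0| + (1 - q)|1\rangle\langle 1|,
\end{aligned} \tag{68} \]

where $|0\rangle$, $|1\rangle$ are the eigenkets of $\hat\rho_1$, $0 \le q \le 1$,
and without loss of generality we can write
$|\psi_0\rangle = \cos\theta|0\rangle + \sin\theta|1\rangle$. It is clear that
the von Neumann measurement

\[ \begin{aligned}
\hat\pi_0 &= |\psi_0\rangle\langle\psi_0|, \\
\hat\pi_1 &= 1 - |\psi_0\rangle\langle\psi_0|
\end{aligned} \tag{69} \]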

can unambiguously discriminate the two possibilities: if outcome 1 is obtained,
we can say for sure that the state was $\hat\rho_1$, while the result 0 is
interpreted as inconclusive. However, this measurement never tells us if the
state was $|\psi_0\rangle$. In this case it may be useful to consider unambiguous
discrimination within the framework of maximum confidence measurements. It is
then possible to construct a measurement which sometimes identifies $\hat\rho_1$
with certainty, but also sometimes identifies $\hat\rho_0$ as confidently as
possible. In general an inconclusive result will also be necessary.
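The asymmetry between the two outcomes of the von Neumann measurement can be made concrete with Bayes' rule. This is a minimal sketch, assuming real amplitudes and a prior `prior0` for the pure state (the complementary prior is assigned to the mixed state); the function names and default values are illustrative.

```python
import math


def trace_product(a, b):
    """Tr(a b) for real 2x2 matrices given as nested lists."""
    return sum(a[r][c] * b[c][r] for r in range(2) for c in range(2))


def posterior_confidences(theta, q, prior0=0.5):
    """Return (P(rho_0 | outcome 0), P(rho_1 | outcome 1))."""
    c, s = math.cos(theta), math.sin(theta)
    rho0 = [[c * c, c * s], [c * s, s * s]]        # |psi_0><psi_0|
    rho1 = [[q, 0.0], [0.0, 1.0 - q]]              # mixed state, diagonal in its eigenbasis
    pi0 = rho0                                     # projector onto |psi_0>
    pi1 = [[1.0 - c * c, -c * s], [-c * s, 1.0 - s * s]]  # 1 - |psi_0><psi_0|
    prior1 = 1.0 - prior0

    def posterior(prior, rho, pi):
        evidence = prior0 * trace_product(rho0, pi) + prior1 * trace_product(rho1, pi)
        return prior * trace_product(rho, pi) / evidence

    return posterior(prior0, rho0, pi0), posterior(prior1, rho1, pi1)
```

Outcome 1 identifies the mixed state with unit confidence, because the pure state never triggers it, whereas the confidence attached to outcome 0 stays below one whenever the mixed state has some overlap with $|\psi_0\rangle$.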

Now suppose that instead of maximising the conditional probability in
equation (52) independently for each state in the set we choose to maximise a
weighted average of these probabilities. We would then obtain as our figure
of merit

\[ P_{corr} = \sum_i P(i) P(\hat\rho_i|i) = \sum_i P(\hat\rho_i) P(i|\hat\rho_i), \tag{70} \]

which is precisely the figure of merit maximised by the minimum error
measurement. Thus these two strategies can be thought of as applying a different
optimality condition to the same quantity. The minimum error measurement also
has the additional constraint that the operators $\{\hat\pi_i\}$ must form a
complete measurement, as it is never optimal to allow inconclusive results to
occur. This constraint makes finding the optimal measurement a difficult problem,
although the conditions which the optimal measurement must satisfy are known, as
we have shown. By contrast, the maximum confidence strategy allows a closed form
solution for an arbitrary set of states. In the special case where the maximum
confidence figure of merit is the same for all states $\hat\rho_i$ and no
inconclusive result is needed, the two strategies coincide. More generally, it
is clear by examination of equation (70) that an upper bound for the minimum
error figure of merit is given by the largest value of
$[P(\hat\rho_i|i)]_{\max}$ for a given set (i.e. the largest value of
equation (58)).
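The bound can be written out in one line. The following sketch assumes a complete measurement with no inconclusive outcome, so that the outcome probabilities $P(i)$ sum to one; the second line is the check for the three-state example of equation (59), using the values 2/3 and $(1+\sin 2\theta)/3$ found there.

```latex
% Assumes \sum_i P(i) = 1 (no inconclusive outcome).
\begin{aligned}
P_{corr} = \sum_i P(i)\,P(\hat\rho_i|i)
  &\le \Big(\max_k\,[P(\hat\rho_k|k)]_{\max}\Big) \sum_i P(i)
  = \max_k\,[P(\hat\rho_k|k)]_{\max}, \\
\tfrac{1}{3}(1+\sin 2\theta) &\le \tfrac{2}{3}
  \qquad (0 \le \theta \le \pi/4).
\end{aligned}
```

Equality in the example holds only at $\theta = \pi/4$, which is the case in which no inconclusive result is needed.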

3.5. Mutual information 

In communications theory the performance of a communications channel is 
quantified not by an error probability but rather by the information conveyed. 
We can give a precise meaning to this by invoking Shannon's noisy channel 
coding theorem [85, 86], which states that the maximum communications rate, 
or channel capacity, is obtained by maximising the mutual information between 
the transmitter and receiver. If the transmitted message, $A$, is one of the set
$\{a_i\}$ and the reception event, $B$, is one of the set $\{b_j\}$, then the
mutual information is defined to be

\[ H(A:B) = \sum_{i,j} P(a_i, b_j) \log\left( \frac{P(a_i, b_j)}{P(a_i)P(b_j)} \right), \tag{71} \]

where the logarithm is usually taken to be base 2 so that the information is
expressed in bits. For a quantum channel, the state $\hat\rho_i$ is selected with
probability $p_i$ and the measurement result $b_j$ is associated with the
probability operator $\hat\pi_j$. It follows that the mutual information is

\[ H(A:B) = \sum_{i,j} p_i \mathrm{Tr}(\hat\rho_i\hat\pi_j) \log\left( \frac{\mathrm{Tr}(\hat\rho_i\hat\pi_j)}{\mathrm{Tr}(\hat\rho\hat\pi_j)} \right), \tag{72} \]

where $\hat\rho = \sum_i p_i\hat\rho_i$. The maximum value of the mutual
information is found by varying both the preparation probabilities, $p_i$, and
the measurement strategies.
This is a very difficult optimisation problem and there are very few exact so- 
lutions known [87, 88]. A scarcely simpler problem is to fix the preparation 
probabilities and then seek the maximum value to give what is referred to as 
the accessible information [89]. 

For two pure states, it is known that the mutual information is maximised
if the states are prepared with equal probability and if the minimum error
measurement is employed [88]. For three or more states, the accessible
information is known if the states are equally likely to be selected and possess
a degree of symmetry. In particular, for the so-called trine ensemble of three
equally probable states (33), the accessible information is obtained with a
generalised measurement with probability operators

\[ \begin{aligned}
\hat\pi_3 &= \frac{2}{3}|1\rangle\langle 1|, \\
\hat\pi_2 &= \frac{2}{3}\left( \frac{1}{2}|1\rangle + \frac{\sqrt{3}}{2}|0\rangle \right)\left( \frac{1}{2}\langle 1| + \frac{\sqrt{3}}{2}\langle 0| \right), \\
\hat\pi_1 &= \frac{2}{3}\left( \frac{1}{2}|1\rangle - \frac{\sqrt{3}}{2}|0\rangle \right)\left( \frac{1}{2}\langle 1| - \frac{\sqrt{3}}{2}\langle 0| \right).
\end{aligned} \tag{73} \]

Note that the accessible information is obtained not by maximising the
probability for determining the state but rather for eliminating one of the
states so that

\[ \langle\psi_i|\hat\pi_j|\psi_i\rangle = \frac{1}{2}(1 - \delta_{ij}). \tag{74} \]

A similar strategy works well for four equiprobable states arranged so as to form
a regular tetrahedron on the Bloch or Poincaré sphere [87]. For more states,
optimal strategies have been demonstrated with fewer measurement outcomes
than states [89].
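The advantage of elimination over identification for the trine ensemble can be checked directly from the mutual information formula. The sketch below assumes real trine amplitudes with one state along $|0\rangle$ and the other two rotated by 120 degrees in state space (the labelling is an assumption), and compares probability operators projecting orthogonally to each state with those projecting onto each state, both weighted by 2/3.

```python
import math


def trine_states():
    """Real amplitudes (c0, c1) of three symmetric states; the labelling is assumed."""
    h = math.sqrt(3) / 2
    return [(-0.5, -h), (-0.5, h), (1.0, 0.0)]


def orthogonal(state):
    """Unit vector orthogonal to a real two-component state."""
    return (-state[1], state[0])


def conditional_probs(states, directions, weight):
    """P(j|i) = weight * |<d_j|psi_i>|^2 for pi_j = weight * |d_j><d_j|."""
    return [[weight * (d[0] * s[0] + d[1] * s[1]) ** 2 for d in directions]
            for s in states]


def mutual_information(priors, cond):
    """H(A:B) in bits from priors p_i and conditional probabilities P(j|i)."""
    n_out = len(cond[0])
    p_out = [sum(priors[i] * cond[i][j] for i in range(len(priors)))
             for j in range(n_out)]
    total = 0.0
    for i, p in enumerate(priors):
        for j in range(n_out):
            joint = p * cond[i][j]
            if joint > 0.0:                      # 0 log 0 is taken as 0
                total += joint * math.log2(cond[i][j] / p_out[j])
    return total


states = trine_states()
priors = [1 / 3] * 3
elimination = conditional_probs(states, [orthogonal(s) for s in states], 2 / 3)
identification = conditional_probs(states, states, 2 / 3)
info_elimination = mutual_information(priors, elimination)
info_identification = mutual_information(priors, identification)
```

Under these assumptions the elimination measurement yields $\log_2(3/2) \approx 0.585$ bits, while the measurement that projects onto the states themselves yields 1/3 bit.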

3.6. No signaling bounds on state discrimination 

Up to now we have discussed the limits on quantum state discrimination by 
mathematically formulating figures of merit which may then be evaluated and 
compared for any allowed measurement by virtue of the generalised measure- 
ment formalism. It is interesting to note however that it is possible to place tight 
bounds on state discrimination without any reference to generalised measure- 
ments, by appealing to the no signaling principle, the condition that information 
may not propagate faster than the speed of light. 

Although entanglement appears to allow space-like separated quantum sys- 
tems to influence one another instantaneously, it may be shown that quantum 
mechanical correlations do not allow signaling [90-93]. Further, due to the im- 
plications of this in reconciling quantum mechanics with special relativity, it 
has been suggested that the no-signaling principle be given the status of a phys- 
ical law, which may be used to limit quantum mechanics and possible exten- 
sions of it [94, 95]. In practice, bounds on the fidelity of quantum cloning ma- 
chines [96,97], the success probability of unambiguous discrimination [98,99], 
and the maximum confidence figure of merit [100] have been derived using no- 
signaling arguments. In particular, the no-signaling principle may be used to 



put a tight bound on unambiguous discrimination of two pure states [98], and 
to derive the maximum confidence strategy [100]. We will discuss these two 
cases here. 



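3.6.a. Unambiguous discrimination

Consider the entangled state

\[ |\Psi\rangle = \sqrt{p_0}|\psi_0\rangle_L|0\rangle_R + \sqrt{1 - p_0}|\psi_1\rangle_L|1\rangle_R, \tag{75} \]

where $|\psi_0\rangle_L$, $|\psi_1\rangle_L$ are non-orthogonal states of the left
system (given by equation (10)), and $|0\rangle_R$, $|1\rangle_R$ form an
orthonormal basis for the right system. The reduced density operator of the
right system may be obtained by taking the partial trace over the left system,
and is given by

\[ \hat\rho_R = \begin{pmatrix} p_0 & \sqrt{p_0(1 - p_0)}\cos 2\theta \\ \sqrt{p_0(1 - p_0)}\cos 2\theta & 1 - p_0 \end{pmatrix}. \tag{76} \]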



According to the no-signaling principle, no operation performed on the left sys- 
tem may be detected by measurement of the right system alone, as this could 
be used to signal faster than light. Thus, after any physically allowed trans- 
formation of the left system, the reduced density operator of the right system 
must remain the same. Consider now a measurement which discriminates
unambiguously between the states $|\psi_0\rangle_L$, $|\psi_1\rangle_L$ of the
left system. If outcome 0 is realised, which occurs with some probability $q_0$,
the right system is projected into the state $|0\rangle_R$, due to the initial
entanglement between the systems. Similarly outcome 1 projects the right system
into state $|1\rangle_R$, with probability $q_1$. There is also the inconclusive
result, which transforms the right system to some as yet unknown state

\[ \hat\rho_? = \begin{pmatrix} \rho^?_{00} & \rho^?_{01} \\ \rho^?_{10} & \rho^?_{11} \end{pmatrix} \tag{77} \]

with probability $q_?$. No signaling implies

\[ \hat\rho_R = \begin{pmatrix} q_0 & 0 \\ 0 & q_1 \end{pmatrix} + q_? \begin{pmatrix} \rho^?_{00} & \rho^?_{01} \\ \rho^?_{10} & \rho^?_{11} \end{pmatrix}. \tag{78} \]

The task is then to minimise $q_?$ subject to the above condition and the
conditions $\hat\rho_? \ge 0$, $q_0, q_1, q_? \ge 0$. This optimisation is
straightforward [98], and remarkably gives precisely the Jaeger and Shimony
result [55] discussed in Section 3.2. Thus the no-signaling condition may be
used to place a tight bound on the success probability of unambiguous
discrimination, without any reference to generalised measurements.
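The optimisation can be sketched in a few lines. The steps below assume that the overlap of the two left-system states is real and equal to $\cos 2\theta \ge 0$, and that $\hat\rho_?$ is normalised to unit trace; they compare the matrix elements on the two sides of equation (78) and use the positivity of $\hat\rho_?$.

```latex
\begin{aligned}
&\text{diagonal elements:} &&
  q_?\rho^?_{00} = p_0 - q_0, \qquad q_?\rho^?_{11} = 1 - p_0 - q_1, \\
&\text{off-diagonal element:} &&
  q_?\rho^?_{01} = \sqrt{p_0(1-p_0)}\cos 2\theta, \\
&\text{positivity of } \hat\rho_?: &&
  |\rho^?_{01}|^2 \le \rho^?_{00}\,\rho^?_{11}, \\
&\text{hence} &&
  q_? = q_?\,(\rho^?_{00} + \rho^?_{11})
      \ge 2 q_? \sqrt{\rho^?_{00}\rho^?_{11}}
      \ge 2 q_? |\rho^?_{01}|
      = 2\sqrt{p_0(1-p_0)}\cos 2\theta .
\end{aligned}
```

Equality requires $\rho^?_{00} = \rho^?_{11} = 1/2$, which is compatible with $q_0, q_1 \ge 0$ only when the prior probabilities are not too unequal; outside that range the minimum lies on the boundary $q_0 = 0$ or $q_1 = 0$, and this sketch does not treat that case.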



3.6.b. Maximum confidence measurements 

The confidence in identifying a given state $|\psi_j\rangle$ as a result of a
state discrimination measurement on the ensemble $\{|\psi_i\rangle, p_i\}$ is
simply the probability that it was state $|\psi_j\rangle$ that gave rise to the
measurement outcome observed. Consider now the entangled state

\[ |\Psi\rangle = \sum_{i=0}^{N-1} \sqrt{p_i}|\psi_i\rangle_L|i\rangle_R, \tag{79} \]

where $\{|\psi_i\rangle_L\}$ are non-orthogonal states of the left system which
together span a $D \le N$-dimensional space, and $\{|i\rangle_R\}$ form an
orthonormal basis for the right system. Now for any measurement performed on the
left system of the entangled pair, the probability that it was state
$|\psi_j\rangle_L$ which gave rise to the observed outcome is equivalent to the
probability that the right system is now found in state $|j\rangle_R$. Thus, if
measurement outcome $j$ causes the right system to transform to
$\hat\rho_{R|j}$, we can write

\[ P(\psi_j|j) = {}_R\langle j|\hat\rho_{R|j}|j\rangle_R. \tag{80} \]

It may be shown (by reference to the Schmidt decomposition of $|\Psi\rangle$
[101]) that although the right system lies in an $N$-dimensional Hilbert space,
it is confined to a $D$-dimensional subspace (with projector denoted $\hat P_D$
below) due to the entanglement with the left system. The key point, then, is to
notice that any operation performed on the left system cannot take the right
system out of this subspace, since this could be detected with some probability
by a measurement on the right system alone, and thus could be used to signal.
Thus ${}_R\langle j|\hat\rho_{R|j}|j\rangle_R$ is restricted by the requirement
that $\hat\rho_{R|j}$ lies in this subspace, and is clearly bounded by the
magnitude of the projection of $|j\rangle_R$ onto this space

\[ P(\psi_j|j) = {}_R\langle j|\hat\rho_{R|j}|j\rangle_R \le {}_R\langle j|\hat P_D|j\rangle_R. \tag{81} \]

Further, this bound is achievable and is equivalent to that obtained previously
(equation (58)) [100]. Similar arguments may be applied to the mixed state
case, and the maximum confidence strategy is derived in a natural way from
no-signaling considerations. Finally, we note that in the case where the states
are linearly independent, $D = N$, and the right system occupies the entire
$N$-dimensional Hilbert space. In this case the limit is unity, corresponding
to unambiguous discrimination.
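The equivalence with equation (58) can be seen with a little linear algebra. In the sketch below, $M$ is a $D \times N$ matrix whose columns are $\sqrt{p_i}|\psi_i\rangle$; this notation is introduced here for illustration and does not appear in the cited derivation. The inverse $\hat\rho^{-1}$ is taken on the $D$-dimensional span of the states, and $*$ denotes complex conjugation in the $\{|i\rangle_R\}$ basis.

```latex
% M: D x N matrix with columns \sqrt{p_i}|\psi_i> (notation assumed for this sketch).
\begin{aligned}
\hat\rho &= M M^\dagger, \qquad \hat\rho_R = \big(M^\dagger M\big)^{*}, \\
\hat P_D &= \big(M^\dagger \hat\rho^{-1} M\big)^{*}, \\
{}_R\langle j|\hat P_D|j\rangle_R
  &= p_j \langle\psi_j|\hat\rho^{-1}|\psi_j\rangle
   = \gamma_{\max}\!\left(\hat\rho^{-1/2}\, p_j |\psi_j\rangle\langle\psi_j|\, \hat\rho^{-1/2}\right).
\end{aligned}
```

The last equality holds because the operator inside $\gamma_{\max}$ has rank one, so its only non-zero eigenvalue equals its trace. For the three symmetric states of equation (59) each diagonal element is $\frac{1}{3}(1 + 1) = 2/3$, the confidence found earlier.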

4. State Discrimination - Experiments 

The theory of generalised measurements has a mathematically appealing
generality in that it depends only on the overlaps of the possible states to be
discriminated and on the probabilities that each was the state prepared. The
nature of the physical states, be they nuclear spins, optical coherent states or
electronic energy levels in an atom, is unimportant. In performing experimental
demonstrations, however, the choice of physical system is of primary importance.
We require a physical system in which superpositions are relatively stable, easy
to prepare and to manipulate and also, of course, to measure. For all these
reasons, the system of choice has usually been photon polarisation, which forms
the basis of our review.

4.1. Photon Polarisation 

At least within paraxial optics [102] the electric and magnetic fields are very 
nearly perpendicular to the direction of propagation of the light. It is conven- 
tional to define the polarisation by the orientation of the electric field in this 
transverse plane [103]. Two orthogonal polarisations then correspond to fields 
in which the electric fields are oriented at 90° to each other. The polarisation of 
a single photon is an excellent two-state quantum system, or qubit [4, 101], as
we can identify the states of horizontal and vertical polarisation with the logical
$|0\rangle$ and $|1\rangle$ states of a qubit:

\[ |0\rangle = |H\rangle, \qquad |1\rangle = |V\rangle. \tag{82} \]

Other polarisations are superpositions of these states. In particular, as
illustrated in Fig. 4, linear polarisation at ±45° to the horizontal and circular
polarisations are the superpositions

\[ \begin{aligned}
|{+45°}\rangle &= \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle), \\
|{-45°}\rangle &= \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle), \\
|L\rangle &= \frac{1}{\sqrt{2}}(|0\rangle + i|1\rangle), \\
|R\rangle &= \frac{1}{\sqrt{2}}(|0\rangle - i|1\rangle).
\end{aligned} \tag{83} \]

Fig. 4. Polarisation of light as a two-level system, or qubit.

The set of all possible pure states of polarisation can be represented on the
surface of a sphere, the Poincaré sphere [104, 105], which is an equivalent
representation to the Bloch sphere used for qubits in quantum information
theory [4, 101]. States of optical polarisation can be changed coherently by
delaying one polarisation compared with the orthogonal polarisation, usually by
a quarter or half a wavelength, using birefringent wave plates. A combination of
three suitably oriented half- and quarter-wave plates can perform any desired
transformation, corresponding to a rotation on the Poincaré sphere through any
desired angle about any desired axis. In this way we can realise any desired
single-qubit unitary transformation.

It is important, in order to realise generalised measurements, to be able to 
superpose fields and also to be able to spatially separate different polarisations. 
These tasks can be performed using beam-splitters and polarising beam split- 
ters. For fully overlapping modes with the same frequency, we can write the 
output annihilation operators in terms of those for the input modes. For a sym- 
metric polarisation-independent beam splitter we find [57] 

\[ \begin{aligned}
\hat a_3 &= r\hat a_1 + t\hat a_2, \\
\hat a_4 &= t\hat a_1 + r\hat a_2,
\end{aligned} \tag{84} \]

where the input and output modes are labelled as in Fig. 5.

Enforcing the canonical commutation relations for the output modes constrains
the reflection and transmission coefficients:

\[ |t|^2 + |r|^2 = 1, \qquad r t^* + t r^* = 0. \tag{85} \]
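These two constraints are easy to test for a candidate pair of coefficients. The sketch below is illustrative only: the function names and the numerical tolerance are assumptions, and the 50:50 example uses the common convention of a real transmission coefficient with a reflection coefficient carrying a phase of $i$.

```python
import math


def beam_splitter_checks(t, r):
    """Return (|t|^2 + |r|^2, r t* + t r*) for complex coefficients t and r."""
    t, r = complex(t), complex(r)
    norm = abs(t) ** 2 + abs(r) ** 2
    cross = r * t.conjugate() + t * r.conjugate()
    return norm, cross


def is_valid_beam_splitter(t, r, tol=1e-12):
    """True if the output modes satisfy the canonical commutation relations."""
    norm, cross = beam_splitter_checks(t, r)
    return abs(norm - 1.0) < tol and abs(cross) < tol


# A symmetric 50:50 beam splitter: reflection is a quarter cycle out of phase.
balanced_ok = is_valid_beam_splitter(1 / math.sqrt(2), 1j / math.sqrt(2))
# Equal real coefficients conserve energy but violate the second condition.
real_pair_ok = is_valid_beam_splitter(1 / math.sqrt(2), 1 / math.sqrt(2))
```

The second condition forces the reflected and transmitted amplitudes of a symmetric beam splitter to differ in phase by a quarter cycle, which is why the purely real pair fails.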



A polarising beam splitter is designed to transmit horizontally polarised light
and to reflect vertically polarised light. This means that input and output
annihilation operators are related by

\[ \hat a_4^H = \hat a_1^H, \qquad \hat a_4^V = \hat a_2^V. \tag{86} \]

Fig. 5. A beamsplitter can be used to superpose or separate field modes.
The input and output modes are labelled with the associated annihilation
operators.



In correlating photon polarisation and direction, a polarising beam splitter can
be used to prepare (filter) light with a desired polarisation or, in conjunction
with photodetectors placed in each output beam, to measure the polarisation.
Polarising beam splitters also allow us to perform different transformations on
two orthogonal polarisations, and this is crucial in enabling us to perform
generalised measurements.

We should make one important point before describing any of the experi- 
ments that have been performed and this is that they have not been done with 
single photon sources. All of them rely on linear optical elements and processes 
and for these, the single-photon probability amplitudes and the associated prob- 
abilities behave in the same way as the amplitudes and intensities of classical 
optics. Some of the experiments have been performed at light levels in the quan- 
tum regime, however, and this suggests strongly that the devices will work in 
the same way given single photon sources and detectors. 
Fig. 6. Schematic of the Barnett-Riis experiment achieving the Helstrom
bound for state discrimination between two pure states. 

4.2. Minimum Error Discrimination 
4.2.a. Two states

The simplest minimum error problem is, as we have seen, that for two pure
states (10). For the photon polarisations described above these correspond to
two states of linear polarisation, oriented at $+\theta$ and $-\theta$ to the
horizontal, so that the angle between them is $2\theta$, for a range of values
of $\theta$ between $0$ and $\pi/4$. If the two states are prepared with equal
prior probability then, as we have seen, the minimum error measurement
corresponds to a familiar von Neumann, or projective, measurement with two
projectors associated with the orthogonal states (13). For optical polarisation,
this corresponds to measuring the polarisation at 45° to the horizontal. Thus
the minimum error strategy in this case is a simple polarisation measurement.
The experiment to test this [106] was performed using light pulses with on
average 0.1 photons per pulse prepared in the desired polarisation state by use
of a Glan-Thompson polariser oriented so as to produce polarised light at the
angle $+\theta$ or $-\theta$ to the horizontal. These were then measured using a
polarising beam splitter oriented so as to transmit light polarised at +45° to
the horizontal and to reflect the orthogonal polarisation. The experimental
apparatus is shown in Fig. 6. Results were found to be in excellent agreement
with the Helstrom value (12) for equal prior probabilities:

\[ P_{err} = \frac{1}{2}(1 - \sin 2\theta). \tag{87} \]
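That a polarisation measurement at 45° attains the Helstrom value can be verified with Malus' law. The sketch below assumes ideal optics, equal priors, and the decision rule that a click in the +45° port is read as the state at $+\theta$ and a click in the orthogonal port as the state at $-\theta$; the function names are illustrative.

```python
import math


def helstrom_bound(theta):
    """Minimum error probability for equal priors and state overlap cos(2 theta)."""
    overlap = math.cos(2.0 * theta)
    return 0.5 * (1.0 - math.sqrt(1.0 - overlap ** 2))


def error_probability_at_45_degrees(theta):
    """Error rate of a polarisation measurement in the +45/-45 degree basis."""
    def pass_probability(polarisation_angle, analyser_angle):
        # Malus' law for linear polarisation through a linear analyser
        return math.cos(polarisation_angle - analyser_angle) ** 2

    p_correct_plus = pass_probability(theta, math.pi / 4)
    p_correct_minus = pass_probability(-theta, -math.pi / 4)
    return 1.0 - 0.5 * (p_correct_plus + p_correct_minus)
```

For $0 \le \theta \le \pi/4$ both functions return $\frac{1}{2}(1 - \sin 2\theta)$, falling from 1/2 for identical states to zero for orthogonal ones.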



4.2.b. Three or four states 



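Finding a minimum error strategy for discriminating between more than two
states is, in general, a difficult problem, although very general statements
about the solution can be made for qubits [44]. For the trine ensemble of three
equiprobable linear polarisation states

\[ \begin{aligned}
|\psi_1^3\rangle &= -\frac{1}{2}\left( |H\rangle + \sqrt{3}|V\rangle \right), \\
|\psi_2^3\rangle &= -\frac{1}{2}\left( |H\rangle - \sqrt{3}|V\rangle \right), \\
|\psi_3^3\rangle &= |H\rangle,
\end{aligned} \tag{88} \]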



and the tetrad ensemble of four equiprobable states 
(89)



the square root measurement is readily shown to give the minimum probabil- 
ity for error. The trine states are states of linear polarisation separated by 60° 
and the tetrad states are two states of linear polarisation and two of elliptical 
polarisation. In each case they form a set of maximally separated points on the 
surface of the Poincaré sphere, as shown in Fig. 7.




Fig. 7. Representation of the trine (left) and tetrad (right) states on the 
Poincaré sphere.



Fig. 8. Schematic of the Clarke et al. experimental realisation of minimum
error discrimination between the trine states. PBS1-3 = polarising beam
splitters, HWP1-3 = half waveplates, PD1-3 = photodetectors. For details
see [107].

In order to measure more than two orthogonal states of polarisation we need
to introduce an additional degree of freedom and a suitable one is provided by
the path of the light beam. We shall illustrate this idea only for the trine
ensemble, the experimental set-up for which is shown in Fig. 8. Details for the
tetrad ensemble can be found in [107]. The input polarising beam splitter
separates, coherently, the polarisation components by transmitting the
horizontal component and reflecting the vertical component. This allows us to
manipulate these components independently. A half-wave plate placed in the path
of the horizontally polarised beam rotates the polarisation so that only the
requisite fraction of it is transmitted at the next polarising beam splitter.
The vertically polarised beam is transformed into a horizontally polarised beam
so that it can be recombined coherently with what is left of the originally
horizontally polarised beam. The polarisation of this combined beam is then
analysed using a final polarising beam splitter. The photon ends up in one of
the three photodetectors and we can think of each of the trine polarisation
states being transformed into a superposition of exit paths from the
interferometer [107]:

\v4) - ^|M>+-L|rt)-J_|M>, (90) 

where a photon in path $P_i$ will be detected in photodetector $i$. This
measurement device is optimal as it correctly identifies the initial
polarisation state with probability 2/3.

4.3. Unambiguous Discrimination 




Fig. 9. Schematic of the Clarke et al. experimental realisation of unam- 
biguous discrimination between two non-orthogonal polarisation states. 



Unambiguous discrimination between non-orthogonal polarisation states,
like the minimum error measurements described above, requires an extension
of the two-dimensional state space and an interferometer is the ideal device for
implementing this. The idea is depicted in Fig. 9. We have two possible linear
polarisation states, each of which has a larger vertical component of polarisation
than horizontal. The double-headed arrows are intended to represent the
magnitudes of the probability amplitudes at various places. The input polarising
beam splitter reflects the vertical component and transmits the horizontal
component. The mirror in the upper arm of the interferometer transmits just
enough for the reflected field to have the same amplitude as that in the lower
arm. If the photon escapes from the interferometer at this point then the
measurement is inconclusive. If it does not, however, then the amplitudes for
the vertical and horizontal fields are equal in magnitude and become orthogonal
when recombined at the output polarising beam splitter. At this stage they can
be discriminated with certainty using a final, suitably oriented, polarising
beam splitter.
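The effect of the partially transmitting mirror can be followed with two amplitudes. The sketch below assumes input states $a|H\rangle \pm b|V\rangle$ with real amplitudes and $0 < a \le b$, and an ideal mirror that keeps a fraction $a/b$ of the vertical amplitude; the function and variable names are illustrative.

```python
import math


def attenuate_and_discriminate(a):
    """Follow states a|H> +/- b|V> (0 < a <= b, a^2 + b^2 = 1) through the mirror.

    Returns (overlap after attenuation, inconclusive probability,
    magnitude of the original overlap).
    """
    b = math.sqrt(1.0 - a * a)
    kept = a / b                          # vertical amplitude fraction kept by the mirror
    plus = (a, kept * b)                  # unnormalised state after the mirror
    minus = (a, -kept * b)
    overlap_after = plus[0] * minus[0] + plus[1] * minus[1]
    p_conclusive = plus[0] ** 2 + plus[1] ** 2   # photon stays in the interferometer
    overlap_before = abs(a * a - b * b)
    return overlap_after, 1.0 - p_conclusive, overlap_before
```

After the mirror the two states are orthogonal, and the probability that the photon leaks out equals the magnitude of the original overlap, $b^2 - a^2$.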




[Horizontal axes of the two panels: Angle α (degrees); Half angle of separation (degrees).]

Fig. 10. Results of the Clarke et al. experimental realisation of unambiguous
discrimination between two non-orthogonal polarisation states. The rate of
inconclusive results is shown on the left, and the error rate for each initial
state given on the right. A model taking into account the non-ideal
characteristics of the beamsplitters was used to generate the non-ideal theory
plots in each case. For full details see [109]. Copyright (2001) by the
American Physical Society.

The first demonstration of unambiguous discrimination between non- 
orthogonal polarisation states used a specially selected length of polarisation 
maintaining fibre [108]. This has the effect of maintaining, with low losses, 
the horizontal component of polarisation but attenuating the orthogonal ver- 
tical component. If the length of the fibre is chosen appropriately then any 
light exiting the fibre will be in one of two orthogonal polarisations and so can 
be discriminated with certainty. An interferometric experiment has the
advantage that it allows us to measure also the ambiguous results. The
experimental setup [109] is very similar to that for the minimum error
discrimination of the three trine states, but with the three measured outputs
now corresponding to the unambiguous identification of the states
$|\psi_0\rangle$, $|\psi_1\rangle$ and to the ambiguous result. The results of
this experiment are shown in Fig. 10.

Fig. 11. Schematic of the experimental apparatus used to demonstrate
maximum confidence discrimination between three elliptical polarisation
states. PBS1-4 = polarising beamsplitters, QWP1-2 = quarter waveplates,
HWP1-4 = half waveplates, PD0-2, PD? = photodetectors.

We have presented here only the simplest experiments, but more compli- 
cated problems have also been addressed. In particular, unambiguous discrim- 
ination has been demonstrated for three possible states and also between non- 
orthogonal pure and mixed states [110]. The generalised measurements de- 
scribed here have all been implemented using light but the principles are in- 
dependent of the system used. It should be noted, particularly in the context of 
quantum information, that non-orthogonal states encoded in the energy levels 
of atoms or ions can similarly be subjected to generalised measurements with 
unoccupied levels used to assist in the process [111]. 



4.4. Maximum confidence measurements 

Maximum confidence discrimination between three symmetric states in two 
dimensions (the simplest possible case), has also been demonstrated experi- 
mentally using the polarisation of light as a qubit [1 12, 1 13]. In the experimen- 
tal realisation, the states given in equation (59) were encoded in the left/right 
circular polarisation basis, and the set-up distinguished between the elliptical 
polarisations 

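$$
\begin{aligned}
|\Psi_0\rangle &= \cos\theta\,|R\rangle + \sin\theta\,|L\rangle, \\
|\Psi_1\rangle &= \cos\theta\,|R\rangle + e^{2\pi i/3}\sin\theta\,|L\rangle, \\
|\Psi_2\rangle &= \cos\theta\,|R\rangle + e^{-2\pi i/3}\sin\theta\,|L\rangle.
\end{aligned}
\qquad (91)
$$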

The maximum confidence measurement for these states, as we have seen, has 
four outcomes, one corresponding to each possible state and one inconclusive 
result. The apparatus used is shown in Fig. 11, and again features an interfer- 
ometer to provide the extension to the state space necessary to realise all four 
outcomes. In this set-up, the outcomes 0 and ? are grouped together in one
output arm of the interferometer, while the other arm corresponds to outcomes 1
and 2. Thus two detectors placed in the output arms A and B of the apparatus
would realise the two-outcome generalised measurement described by the
POM $\{\hat{\pi}_? + \hat{\pi}_0, \hat{\pi}_1 + \hat{\pi}_2\}$. In fact this
set-up is completely general and, by appropriate choice of the orientations of
the waveplates QWP1 and HWP1-3, may be used to implement any such two-element
measurement. Further, any N-outcome measurement may, in principle, be performed
using a number of such modules in series [113, 114]. Thus, after PBS2, two
orthogonal modes in arm A correspond to outcomes 0 and ?, while two orthogonal
modes in arm B correspond to results 1 and 2. Finally, HWP4, QWP2 and PBS3-4 are
used to separate these modes, which are then detected at the photodetectors in
the output arms. The results of this experiment demonstrated an improvement over
the minimum error measurement in the confidence figure of merit for linearly
dependent states and are shown in Fig. 12.
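
As a numerical illustration of the confidence figure of merit for the states of equation (91), the following sketch evaluates the maximum confidence for each state. It assumes equal prior probabilities of 1/3 and uses the standard pure-state expression $p_j \langle\Psi_j|\hat{\rho}^{-1}|\Psi_j\rangle$, with $\hat{\rho}$ the prior density operator; both are assumptions brought in for the example rather than statements taken from this section, and the function names are hypothetical.

```python
import numpy as np

def elliptical_states(theta):
    """Three states of equation (91) as vectors in the (|R>, |L>) basis."""
    # Relative phases 1, e^{2*pi*i/3} and e^{4*pi*i/3} = e^{-2*pi*i/3}.
    phases = np.exp(2j * np.pi * np.arange(3) / 3)
    return [np.array([np.cos(theta), ph * np.sin(theta)]) for ph in phases]

def max_confidence(states, priors):
    """Maximum confidence p_j <psi_j| rho^{-1} |psi_j> for pure states.

    Assumes rho is invertible, i.e. theta is not a multiple of pi/2.
    """
    rho = sum(p * np.outer(s, s.conj()) for p, s in zip(priors, states))
    rho_inv = np.linalg.inv(rho)
    return [float(np.real(p * (s.conj() @ rho_inv @ s)))
            for p, s in zip(priors, states)]

if __name__ == "__main__":
    priors = [1 / 3] * 3  # assumed equal priors
    for theta_deg in (10, 25, 40):
        conf = max_confidence(elliptical_states(np.radians(theta_deg)), priors)
        print(theta_deg, [round(c, 4) for c in conf])
```

Under these assumptions the computed confidence is 2/3 for every state, independent of $\theta$, because the prior density operator is diagonal in the circular polarisation basis.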

4.5. Mutual information 

The strategies for maximising the mutual information for two pure states re- 
quire us to perform a minimum error measurement [88]. With more states we 
require, in general, a generalised measurement [87, 89]. For the trine and tetrad 
states we obtain the accessible information by eliminating, with certainty, one
of the possible states. This can be realised experimentally using the same de- 
vice as that devised for the minimum error measurement, simply by interchang- 
ing everywhere the horizontal and vertical components of polarisation. In other 
words, the device for maximising the mutual information for the trine or tetrad 
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Fig. 12. Results of the maximum confidence discrimination experiment. 
Graph shows the confidence figure of merit for measurement outcomes 
0 (red), 1 (green) and 2 (blue). Lines indicate the theoretical value of the
figure of merit for the maximum confidence (dotted) and minimum error 
(dashed) measurement strategies. Shaded areas indicate the range of values 
consistent with a non-ideal model, taking into account errors introduced 
at the polarising beamsplitters, for details see [113]. Figure reproduced 
from [112], copyright by the American Physical Society. 



states is the same as that for minimising the error in discriminating between a 
set of states orthogonal to the given trine or tetrad. For more than four states of 
linear polarisation, we can maximise the mutual information by performing a 
measurement with just three possible outcomes [89]. 
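
The elimination strategy for the trine and tetrad can be checked numerically. The sketch below assumes equally likely states described by Bloch vectors that sum to zero, and a measurement whose elements are proportional to projectors onto the states orthogonal to each signal state; the weight 2/N and the helper names are choices made for this illustration, not taken from the text.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information in bits of a joint probability table joint[i, j]."""
    joint = np.asarray(joint, dtype=float)
    pa = joint.sum(axis=1, keepdims=True)   # marginal over outcomes
    pb = joint.sum(axis=0, keepdims=True)   # marginal over prepared states
    mask = joint > 1e-12                    # skip zero-probability entries
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (pa @ pb)[mask])))

def elimination_joint(bloch_vectors):
    """Joint distribution P(i, j) for the 'anti-state' measurement.

    Element j is (2/N) times the projector onto the state orthogonal to
    signal state j, so P(j | i) = (2/N) * (1 - n_i . n_j) / 2.
    Assumes equal priors and Bloch vectors summing to zero.
    """
    n = np.asarray(bloch_vectors, dtype=float)
    count = len(n)
    cond = (2.0 / count) * (1.0 - n @ n.T) / 2.0
    cond = np.clip(cond, 0.0, None)         # remove tiny negative round-off
    return cond / count

# Trine: Bloch vectors 120 degrees apart on a great circle.
angles = np.radians([0.0, 120.0, 240.0])
trine = np.stack([np.cos(angles), np.sin(angles), np.zeros(3)], axis=1)

# Tetrad: vertices of a regular tetrahedron on the Bloch sphere.
tetrad = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)

if __name__ == "__main__":
    print(round(mutual_information(elimination_joint(trine)), 3))
    print(round(mutual_information(elimination_joint(tetrad)), 3))
```

In this model each outcome rules out exactly one state and leaves the others equally likely, so the mutual information is $\log_2[N/(N-1)]$ bits for N = 3 and N = 4.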

The experiment to realise the minimum error discrimination between two 
non-orthogonal polarisation states [106] also provided the maximum mutual 
information. For the pure states (10) with $\theta = 15°$, corresponding to
linear polarisations at an angle of 30°, the mutual information derived from
the measurements was [18]

$H_{2\,\mathrm{states}}(A:B) = 0.196 \pm 0.007$ bits, (92)

which compares well with the theoretical value of 0.189 bits. For the trine and 
the tetrad [107] we found 

$H_{\mathrm{trine}}(A:B) = 0.491^{+\ldots}_{-\ldots}$ bits
$H_{\mathrm{tetrad}}(A:B) = 0.363^{+\ldots}_{-\ldots}$ bits, (93)

which should be compared with the theoretical values of 0.585 bits and 0.415
bits respectively. It is important to note that these experimental values are
good enough to demonstrate the necessity of performing a generalised
measurement, as the theoretical maxima of the mutual information for the trine
and tetrad states using conventional projective measurements are 0.459 bits and
0.311 bits respectively. A subsequent, more careful experiment produced a
substantially higher value of 0.556 bits for the mutual information obtained
using the trine ensemble, and also realised the optimal measurements for sets of
five and seven states of linear polarisation [115].
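
The theoretical two-state value quoted above can be reproduced with a short calculation. The sketch assumes that the states (10) have overlap $\cos 2\theta$ and equal prior probabilities, so that the minimum error measurement acts as a binary symmetric channel with the minimum error probability as its crossover probability; these assumptions and the function names are introduced only for illustration.

```python
import math

def binary_entropy(p):
    """Shannon entropy in bits of a binary distribution (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def min_error_probability(theta):
    """Minimum error probability for two equiprobable pure states
    with overlap cos(2 * theta) (assumed form of the states (10))."""
    overlap = math.cos(2.0 * theta)
    return 0.5 * (1.0 - math.sqrt(1.0 - overlap ** 2))

def two_state_mutual_information(theta):
    """Mutual information in bits of the resulting binary symmetric channel."""
    return 1.0 - binary_entropy(min_error_probability(theta))

if __name__ == "__main__":
    theta = math.radians(15.0)
    print(round(min_error_probability(theta), 3))         # 0.25
    print(round(two_state_mutual_information(theta), 3))  # 0.189
```

With $\theta = 15°$ the error probability is 1/4, giving 0.189 bits, in agreement with the theoretical value stated in the text.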

5. Conclusion 

Quantum theory allows us to prepare, at least in principle, even the simplest 
system in an uncountable infinity of different ways. The polarisation of a
single photon, for example, can be prepared in a state that corresponds to any
point on the surface of the Poincaré sphere. It is a fundamental consequence
of the superposition principle, however, that no measurement can discriminate 
with certainty between two non-orthogonal quantum states. The challenge for 
quantum state discrimination is to perform this task as well as is possible. 

It is evident that selecting the best possible measurement in any given sit- 
uation usually requires us to perform a generalised measurement. These are 
general in the sense that they represent, not just projective measurements of 
the kind envisaged by von Neumann [21], but rather the most general measure- 
ments possible within the confines of quantum theory. The POM formalism is, 
as we have seen, a remarkable tool in the search for optimal measurements. 
That this is the case is a consequence of the facts that (i) any set of
probability operators satisfying the required properties listed in section 2
corresponds to a possible quantum measurement and (ii) all possible measurements
can be described by an appropriate set of probability operators. This means that we can
separate the mathematical task of finding the theoretically optimum measure- 
ment from the practical one of designing a measurement to implement it. 

We have seen that optimal measurements have been found to minimise the
error in identifying the state, to discriminate between states unambiguously and
to determine the state with the maximum level of confidence. These
similar-sounding goals are all subtly different and correspond, for the most
part, to quite distinct measurements. We have also discussed yet another task relevant to
quantum communications, that of maximising the information transferred. The 
problem of state discrimination acquired much of its significance from consid- 
ering the problem of quantum communication and in particular from quantum 
cryptography [4-9]. All existing implementations of these are based on optics 
and it is perhaps not surprising, therefore, that it is in optics that the
experimental advances in quantum state discrimination have been made. We have
described, in particular, how quantum-limited measurements on optical
polarisation have been devised to realise the optimal measurements for detection
with minimum error, unambiguous discrimination, detection with maximum
confidence and maximum mutual information. As quantum information technology
develops, the ability to optimise performance by performing the best possible
measurements can only become more important.
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